NUCLEIC ACID MOLECULES ENCODING SERINE PROTEASE 16, THE ENCODED
POLYPEPTIDES AND METHODS BASED THEREON

RELATED APPLICATIONS 

Benefit of priority is claimed to U.S. provisional application Serial No.
60/394,347, filed July 2, 2002, to Edwin L. Madison, Edgar O. Ong and
Juinn-Chern Yeh, entitled "NUCLEIC ACID MOLECULES ENCODING SERINE
PROTEASE 16, THE ENCODED PROTEINS AND METHODS BASED THEREON."
Where permitted, the subject matter of each of the provisional application and
International PCT application No. Docket No. 24745-1625, filed on the same day
herewith, entitled "NUCLEIC ACID MOLECULES ENCODING SERINE PROTEASE
16, THE ENCODED POLYPEPTIDES AND METHODS BASED THEREON", is
incorporated by reference in its entirety.

FIELD OF INVENTION

Nucleic acid molecules that encode proteases and portions thereof,
particularly protease domains, are provided. Also provided are prognostic,
diagnostic and therapeutic methods using the proteases and domains thereof and
the encoding nucleic acid molecules.

BACKGROUND OF THE INVENTION AND OBJECTS THEREOF 

Cancer is a leading cause of death in the United States, developing in one
in three Americans; one of every four Americans dies of cancer. Cancer is
characterized by an increase in the number of abnormal neoplastic cells, which
proliferate to form a tumor mass, the invasion of adjacent tissues by these
neoplastic tumor cells, and the generation of malignant cells that metastasize via
the blood or lymphatic system to regional lymph nodes and to distant sites.

Among the hallmarks of cancer is a breakdown in the communication
among tumor cells and their environment. Normal cells do not divide in the
absence of stimulatory signals, and cease dividing in the presence of inhibitory
signals. Growth-stimulatory and growth-inhibitory signals are routinely
exchanged between cells within a tissue. In a cancerous, or neoplastic, state, a
cell acquires the ability to "override" these signals and to proliferate under
conditions in which normal cells do not grow.

In order to proliferate, tumor cells acquire a number of distinct aberrant
traits reflecting genetic alterations. The genomes of certain well-studied tumors
carry several different independently altered genes, including activated
oncogenes and inactivated tumor suppressor genes. Each of these genetic
changes appears to be responsible for imparting some of the traits that, in the
aggregate, represent the full neoplastic phenotype.

A variety of biochemical factors have been associated with different
phases of metastasis. Cell surface receptors for collagen, glycoproteins such as
laminin, and proteoglycans facilitate tumor cell attachment, an important step in
invasion and metastasis. Attachment triggers the release of degradative
enzymes, which facilitate the penetration of tumor cells through tissue barriers.
Once the tumor cells have entered the target tissue, specific growth factors are
required for further proliferation. Tumor invasion and progression involve a
complex series of events, in which tumor cells detach from the primary tumor,
break down the normal tissue surrounding it, and migrate into a blood or
lymphatic vessel to be carried to a distant site. Destruction and/or remodeling of
normal tissue barriers is accomplished by the elaboration of specific enzymes
that degrade the proteins of the extracellular matrix that make up basement
membranes and stromal components of tissues.

A class of extracellular matrix degrading enzymes has been implicated in
tumor invasion. Among these are the matrix metalloproteinases (MMP). For
example, the production of the matrix metalloproteinase stromelysin is
associated with malignant tumors with metastatic potential (see, e.g., McDonnell
et al. (1990) Smnrs. in Cancer Biology 1:107-115; McDonnell et al. (1990)
Cancer and Metastasis Reviews 9:309-319).

The capacity of cancer cells to metastasize and invade tissue is facilitated
by degradation of the basement membrane. Several proteinase enzymes,
including the MMPs, have been reported to facilitate the process of invasion of
tumor cells. MMPs are reported to enhance degradation of the basement
membrane, which thereby permits tumorous cells to invade tissues. For
example, two major metalloproteinases having molecular weights of about 70
kDa and 92 kDa appear to enhance the ability of tumor cells to metastasize.

Serine Proteases

Serine proteases (SPs) have been implicated in neoplastic disease
progression. Most serine proteases, which are either secreted enzymes or are
sequestered in cytoplasmic storage organelles, have roles in blood coagulation,
wound healing, digestion, immune responses and tumor invasion and metastasis.
A class of cell surface proteins designated type II transmembrane serine
proteases, which are membrane-anchored proteins with additional extracellular
domains, has been identified. As cell surface proteins, they are positioned to
play a role in intracellular signal transduction and in mediating cell surface
proteolytic events. Other serine proteases can be membrane bound and function
in a similar manner. Others are secreted. Many serine proteases exert their
activity upon binding to cell surface receptors and hence act at cell surfaces.
Cell surface proteolysis is a mechanism for the generation of biologically active
proteins that mediate a variety of cellular functions.

Serine proteases, including secreted and transmembrane serine proteases,
have been implicated in processes involved in neoplastic development and
progression. While the precise role of these proteases has not been fully
elaborated, serine proteases and inhibitors thereof are involved in the control of
many intra- and extracellular physiological processes, including degradative
actions in cancer cell invasion, metastatic spread, and neovascularization of
tumors, that are involved in tumor progression. It is believed that proteases are
involved in the degradation and remodeling of extracellular matrix (ECM) and
contribute to tissue remodeling, and are necessary for cancer invasion and
metastasis. The activity and/or expression of some proteases have been shown
to correlate with tumor progression and development.

For example, a membrane-type serine protease MTSP1 (also called
matriptase; see SEQ ID Nos. 1 and 2 from U.S. Patent No. 5,972,616; and
GenBank Accession No. AF118224; (1999) J. Biol. Chem. 274:18231-18236;
U.S. Patent No. 5,792,616; see, also Takeuchi (1999) Proc. Natl. Acad. Sci.
U.S.A. 96:11054-11061) that is expressed in epithelial cancer and normal tissue
(Takeuchi et al. (1999) Proc. Natl. Acad. Sci. USA 96:11054-61) has been
identified. Matriptase was originally identified in human breast cancer cells as a
major gelatinase (see, U.S. Patent No. 5,482,848), a type of matrix
metalloproteinase (MMP). It has been proposed that it plays a role in the
metastasis of breast cancer. Matriptase also is expressed in a variety of
epithelial tissues with high levels of activity and/or expression in the human
gastrointestinal tract and the prostate. MTSPs designated MTSP3, MTSP4 and
MTSP6 have been described in published International PCT application No. WO
01/57194, based on International PCT application No. PCT/US01/03471.

Prostate-specific antigen (PSA), a kallikrein-like serine protease, degrades
the extracellular matrix glycoproteins fibronectin and laminin, and has been
postulated to facilitate invasion by prostate cancer cells (Webber et al. (1995)
Clin. Cancer Res. 1:1089-94). Blocking PSA proteolytic activity with
PSA-specific monoclonal antibodies results in a dose-dependent decrease in vitro
in the invasion of the reconstituted basement membrane Matrigel by LNCaP
human prostate carcinoma cells, which secrete high levels of PSA.

Hepsin, a cell surface serine protease identified in hepatoma cells, is
overexpressed in ovarian cancer (Tanimoto et al. (1997) Cancer Res.
57:2884-7). The hepsin transcript appears to be abundant in carcinoma tissue
and is almost never expressed in normal adult tissue, including normal ovary. It
has been suggested that hepsin is frequently overexpressed in ovarian tumors
and therefore can be a candidate protease in the invasive process and growth
capacity of ovarian tumor cells.

A serine protease-like gene, designated normal epithelial cell-specific 1
(NES1) (Liu et al. Cancer Res. 56:3371-3379 (1996)), has been identified.
Although expression of the NES1 mRNA is observed in all normal and
immortalized nontumorigenic epithelial cell lines, the majority of human breast
cancer cell lines show a drastic reduction or a complete lack of its expression.
The structural similarity of NES1 to polypeptides known to regulate growth
factor activity and a negative correlation of NES1 expression with breast
oncogenesis suggest a direct or indirect role for this protease-like gene product
in the suppression of tumorigenesis.

Hence transmembrane and other serine proteases and other proteases
appear to be involved in the etiology and pathogenesis of tumors. There is a
need to further elucidate their role in these processes and to identify additional
transmembrane proteases. Therefore, among the objects herein, it is an object
herein to provide serine protease proteins and nucleic acids encoding such
proteases that are involved in the regulation of or participate in tumorigenesis
and/or carcinogenesis. It also is an object herein to provide prognostic,
diagnostic and therapeutic screening methods using such proteases and the
nucleic acids encoding such proteases.

SUMMARY OF THE INVENTION

Provided herein are proteins designated CVSP16 and protease domains
thereof. CVSP16 is a member of the serine protease family and is expressed or
active in breast, colon, lung, prostate, kidney, stomach, spleen, thyroid gland,
trachea and pituitary gland and in tumor tissues and cancers, including colon,
breast and prostate cancers, and in leukemias and lymphomas. Hence, as a
protease it can be involved in tumor progression. By virtue of its functional
activity it can be a therapeutic or diagnostic target. The expression and/or
activation (or a reduction in the level of expression or activation) of the
expressed protein (zymogen) can be used to monitor cancer and cancer
therapy.

The serine protease family includes members that are activated and/or
expressed in tumor cells at different levels from non-tumor cells; and those from
cells in which substrates therefor differ in tumor cells from non-tumor cells or
otherwise alter the specificity of the serine protease (SP). The serine protease
provided herein, designated herein as CVSP16, is a secreted protease. Protease
domains and full-length protein, including the zymogen or inactive forms and
activated forms, and uses thereof are also provided. Proteins encoded by splice
variants are also provided. Nucleic acid molecules encoding the proteins and
protease domains are also provided.

CVSP16 polypeptide is expressed as a secreted protein and may bind to
cell surface receptors to function as a cell-surface bound protease, such as by
binding thereto or by dimerization or multimerization with a membrane-bound or
receptor-bound protein. CVSP16 polypeptides are serine proteases and exhibit
catalytic activity and also can exhibit substrate and ligand binding activity. The
CVSP16 proteases provided herein do not include a contiguous sequence of at
least 5 amino acids from SEQ ID No. 21 inserted between residues corresponding
to Q600 and M601 and/or the CVSP16 proteases contain at least two protease
domains of a serine protease 16 (CVSP16) and include at least 5 contiguous
amino acids corresponding to residues 508-544 of SEQ ID No. 6, or contain the
contiguous sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp Asn Asp Ser
or Cys Trp Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the second protease
domain.

An exemplified CVSP16 (see, e.g., SEQ ID Nos. 5 and 6, which set forth
nucleic acid and amino acid sequences thereof) includes two protease domains,
designated PD1 and PD2, respectively (see, e.g., PD1 amino acids 46 to 286
(which includes the R that is cleaved upon activation cleavage), and PD2
including amino acids 323 or 324 or 325 or 326 to 550 of SEQ ID No. 6). The
exemplified CVSP16 contains a signal peptide sequence (e.g., aa 1 to aa 23 of
SEQ ID No. 6) and a first trypsin-like serine protease domain designated herein
as CVSP16 PD1 characterized by the presence of a protease activation cleavage
site (...R46↓I47VGGSNAQP..., where ↓ indicates the protease activation cleavage
site) at the beginning of the domain and the catalytic triad residues (H87, D139
and S243) in 3 highly-conserved regions of the catalytic domain.
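
The residue numbering around an activation cleavage site (cleavage after the arginine, so that the activated domain begins at the following isoleucine) can be illustrated with a short sketch. This is a minimal illustration only: the function name, the default search motif and the toy sequence are assumptions made for the example, and the toy sequence is not SEQ ID No. 6.

```python
def find_activation_cleavage_sites(sequence, motif="RIVGG"):
    """Return 1-based (P1, P1') residue numbers for each occurrence of motif.

    Assumption for this sketch: cleavage occurs immediately after the first
    residue of the motif (the arginine), so the residue that follows it
    becomes the new N-terminus of the activated domain.
    """
    sites = []
    start = sequence.find(motif)
    while start != -1:
        p1 = start + 1                 # 1-based number of the arginine
        sites.append((p1, p1 + 1))     # (cleaved residue, new N-terminus)
        start = sequence.find(motif, start + 1)
    return sites


# Hypothetical toy sequence: three arbitrary residues followed by the
# activation-site fragment quoted in the text.
toy_sequence = "MKT" + "RIVGGSNAQP"
sites = find_activation_cleavage_sites(toy_sequence)
```

Applied to a full-length sequence numbered as in the text, the same bookkeeping would pair residue 46 with an N-terminus at residue 47.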

In addition, CVSP16 has a second protease domain PD2, in which the
catalytic histidine is replaced by a serine, indicating that the second protease
domain has lower catalytic activity. The isolated protease domains as single
chains are provided as are polypeptides that include such protease domains. In
particular, a polypeptide that contains PD1 as the only CVSP16 portion is
provided. Also provided are polypeptides that include PD1 and/or PD2 but do not
include at least 5 contiguous amino acids from SEQ ID No. 21, and polypeptides
that include PD1 or PD2, particularly polypeptides that include PD1 and/or PD2
as the only CVSP16 portion. Also included are CVSP16 proteases that contain
at least two protease domains of a serine protease 16 (CVSP16) and include at
least 5 contiguous amino acids corresponding to residues 508-544 of SEQ ID
No. 6, or contain the contiguous sequence Asn Asp Ser or Trp Asn Asp or Ser
Cys Trp Asn Asp Ser or Cys Trp Asn Asp Ser or Gln Thr His or Leu Gln Thr His
in the second protease domain.

WO 2004/005471

PCT/US2003/020959

Isolated PD1 and/or PD2 are provided as single-chain molecules or as
activated one-, two- or three-chain molecules. Each of PD1 or PD2 can exhibit
functional activity (catalytic, substrate binding and/or ligand binding activity)
without undergoing activation cleavage and/or as single chains.

Nucleic acid encoding the CVSP16 protease and upstream nucleic acid is
set forth in SEQ ID No. 5; and the encoded protein is set forth in SEQ ID No. 6.

The protease domains for use in the methods and assays provided herein
do not have to result from activation, which produces a two- or multi-chain
activated product, but can be a single-chain polypeptide. Such polypeptides,
although not the result of activation and not two-chain forms, can exhibit
proteolytic (catalytic) activity. These protease domain polypeptides are used in
assays to screen for agents that modulate an activity of the CVSP16.

Such assays are also provided herein. In exemplary assays, the effects of
test compounds on the ability of the full-length single-chain form, multiple-chain
activated forms, or a protease domain, which is a single-chain or a double-chain
activated form, of CVSP16 to proteolytically cleave a known substrate, typically
a fluorescently, chromogenically or otherwise detectably labeled substrate, are
assessed. Agents, generally compounds, particularly small molecules, that
modulate an activity of the polypeptide (full length or protease domain(s), either
single or double chain or multi-chain forms thereof) are candidate compounds for
modulating an activity of a CVSP16. The protease domains and full-length
proteins also can be used to produce antibodies that bind to CVSP16s provided
herein, as well as single-chain protease-specific antibodies and/or multi-chain
specific antibodies.

The protease domains provided herein include, but are not limited to, a
single chain region having an N-terminus at a cleavage site for activation of a
zymogen form, through the C-terminus, or C-terminal truncated portions thereof
that exhibit proteolytic activity as a single-chain polypeptide in in vitro
proteolysis assays of any CVSP16 provided herein, from a mammal, including a
human, that, for example, is expressed in tumor cells at different levels from
non-tumor cells. Such protease domains include single-chain PD1 and PD2,
each having an N-terminus that corresponds to that resulting from activation
cleavage (i.e., corresponding to amino acids 47 and 324, respectively, of SEQ ID
No. 6). Also provided are truncated versions thereof, particularly C-terminal
truncated versions, that exhibit proteolytic and/or substrate binding and/or ligand
binding activity.

Also provided are muteins of a single-chain protease domain of CVSP16,
particularly muteins in which a Cys residue (corresponding to residue no. 159) in
a protease domain that is unpaired (i.e., free; does not form disulfide linkages
with any other Cys residue within the same protease domain) is substituted with
another amino acid, generally with a conservative amino acid
substitution or a substitution that does not eliminate an activity, and muteins in
which a glycosylation site(s) is eliminated. An unpaired cysteine
(corresponding to residue no. 430 in SEQ ID No. 6, a residue that does not form
disulfide linkages with any other Cys residue within the second protease domain)
similarly can be substituted. Muteins with other conservative amino acid
substitutions in which catalytic activity or other functional activity is retained are
also contemplated (see, e.g., Table 1, for exemplary amino acid substitutions).

Hence, provided herein are members of the family of serine proteases
designated CVSP16, and functional domains, especially protease (or catalytic)
domains thereof, muteins and other derivatives and analogs thereof. Also
provided herein are nucleic acids encoding the CVSP16.

As noted, the nucleic acid and amino acid sequences of an exemplary
CVSP16 are set forth in SEQ ID Nos. 5 and 6. Molecules with a single or a
plurality of amino acid insertions, deletions or substitutions are provided.
Nucleic acid molecules that encode a single-chain protease domain or
catalytically active portion thereof and also those that encode the full-length
CVSP16 are provided. Also provided are nucleic acid molecules that hybridize to
such CVSP16 encoding nucleic acid along at least about 70%, 80%, 90%, 95%
or more of their full length and encode a protease domain or portion thereof.
Hybridization is typically performed under conditions of at least low, generally at
least moderate, and often high stringency; generally the hybridizing nucleic acid
hybridizes along at least about 70%, 80%, 90%, 95% of its full length at the
recited stringency.

Additionally provided herein are antibodies that specifically bind to the
CVSP16, particularly the CVSP16s provided, including CVSP16s provided herein
that do not include at least 5, 7, 10, 15, 20 or more contiguous amino acids
from SEQ ID No. 21. Also provided are antibodies that specifically bind to
CVSP16 proteases that contain at least two protease domains of a serine
protease 16 (CVSP16) and include at least 5 contiguous amino acids
corresponding to residues 508-544 of SEQ ID No. 6, or contain the contiguous
sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp Asn Asp Ser or Cys Trp
Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the second protease domain.

Included are antibodies that specifically bind to the protein or protease
domain, including to the single and/or double chain forms thereof. Among the
antibodies are two-chain-specific antibodies, single-chain-specific antibodies and
neutralizing antibodies that inhibit functional activity (i.e., catalytic activity
and/or substrate or binding activity). Also provided are antibodies that bind with
at least 2-fold, 5-fold, 10-fold or 100-fold greater affinity to CVSP16
polypeptides that do not include a contiguous portion (5, 7, 10, 15, 20 or more
amino acid residues) of the sequence of amino acids set forth in SEQ ID No. 21
compared to a CVSP16 polypeptide that includes SEQ ID No. 21. Typically the
CVSP16 polypeptides that do not include the sequence of amino acids set forth
in SEQ ID No. 21 do not include it between residues corresponding to Q600 and
M601 (see, SEQ ID No. 6) and/or the CVSP16 polypeptides contain at least two
protease domains of a serine protease 16 (CVSP16) and include at least 5
contiguous amino acids corresponding to residues 508-544 of SEQ ID No. 6, or
contain the contiguous sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp
Asn Asp Ser or Cys Trp Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the
second protease domain.

CVSP16 polypeptides, including, but not limited to, splice variants
thereof, domains, derivatives and analogs thereof are provided herein.
Single-chain protease domains, where the N-terminus is that which would be
generated by activation of the zymogen form, are provided. Multi-chain,
generally two- or three-chain, activated forms of CVSP16 also are provided.

Also provided are cells, combinations, kits and articles of manufacture
that contain the nucleic acid encoding the CVSP16 and/or the CVSP16. Further
provided herein are prognostic, diagnostic and therapeutic screening methods
using CVSP16 and the nucleic acids encoding CVSP16. Transgenic
non-human animals bearing inactivated genes encoding CVSP16, and those
bearing the genes (or inserted cDNA) encoding the CVSP16, particularly under
non-native promoter control or on an exogenous element, such as a plasmid or
artificial chromosome, are additionally provided herein. Also provided are nucleic
acid molecules encoding each of CVSP16 and domains thereof.

Also provided are plasmids containing any of the nucleic acid molecules
provided herein. Cells containing the plasmids are also provided. Such cells
include, but are not limited to, bacterial cells, yeast cells, fungal cells, plant cells,
insect cells and animal cells.

Also provided is a method of producing CVSP16 by growing the
above-described cells under conditions whereby the CVSP16 is expressed by the
cells, and recovering the expressed CVSP16 polypeptide. Methods for isolating
nucleic acid encoding other CVSP16s are also provided.

Also provided are cells, generally eukaryotic cells, such as mammalian
cells and yeast cells, in which the CVSP16 polypeptide is expressed by the cells,
particularly under conditions in which it is secreted. Also provided are cells to
which the CVSP16 polypeptide or a protease domain thereof is bound. Such
cells are used in drug screening assays to identify compounds that modulate an
activity of the CVSP16 polypeptide. These assays include in vitro binding
assays, and transcription based assays in which signal transduction mediated by
the bound CVSP16 is assessed.

The CVSP16 polypeptides (including those that include all or a portion of
SEQ ID No. 21) are of interest for a variety of reasons. For example, they
appear to be expressed and/or activated at different levels in tumor cells from
normal cells, or to have functional activity that is different in tumor cells from
normal cells, such as by an alteration in a substrate therefor, or a cofactor.

It is shown herein that CVSP16 is expressed in cervical cancer. It also
may be expressed in colon, breast, stomach, uterine, ovarian, lung and prostate
tumors and in other tumors as well as in certain normal cells and tissues (see
e.g., EXAMPLES for an exemplary tissue-specific expression profile). The
expression and/or activation thereof and/or its presence above a predetermined
amount in a body fluid can be diagnostic or prognostic of cancer.

Hence the CVSP16 polypeptides provided herein can serve as diagnostic
markers for certain tumors, particularly cervical cancers. The level of activated
CVSP16 can be diagnostic of uterine, pancreatic, breast, lung, stomach, prostate
or colon cancer or leukemia or other cancer.

Also provided herein are methods of modulating an activity of a CVSP16 
and screening for compounds that modulate, including inhibit, antagonize, 
agonize or otherwise alter an activity of the CVSP16. Of particular interest is 
the proteolytic (catalytic) portion of the protein. 

Thus, provided herein are prognostic, diagnostic and therapeutic
screening methods using the CVSP16 and the nucleic acids encoding CVSP16.
In particular, the prognostic, diagnostic and therapeutic screening methods are
used for preventing, treating, or for finding agents useful in preventing or
treating, tumors or cancers such as uterine, stomach, lung carcinoma, colon
carcinoma and other cancers.

Methods of diagnosing a disease or disorder by detecting an
aberrant level of a CVSP16 in a subject are provided. The method can be
practiced by measuring the level of the DNA, RNA, protein or functional activity
of the CVSP16. An increase or decrease in the level of the DNA, RNA, protein
or functional activity of the CVSP16, relative to the level of the DNA, RNA,
protein or functional activity found in an analogous sample not having the
disease or disorder (or other suitable control), is indicative
of the presence of the disease or disorder in the subject.
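
The comparison of a measured level against a control can be sketched as a simple fold-change check. This is a minimal illustration under stated assumptions: the function names and the 2-fold cutoff are hypothetical choices for the example, and the text does not specify any particular threshold or units.

```python
def fold_change(sample_level, control_level):
    """Ratio of the level measured in the test sample to the control level."""
    if sample_level < 0 or control_level <= 0:
        raise ValueError("levels must be non-negative and the control positive")
    return sample_level / control_level


def is_aberrant(sample_level, control_level, fold_cutoff=2.0):
    """Flag an increase or a decrease relative to the control.

    fold_cutoff is a hypothetical threshold: a sample is flagged when its
    level is at least fold_cutoff times the control, or at most
    1/fold_cutoff of the control.
    """
    ratio = fold_change(sample_level, control_level)
    return ratio >= fold_cutoff or ratio <= 1.0 / fold_cutoff
```

The same check applies whether the measured quantity is DNA, RNA, protein or functional activity, provided that the sample and the control are measured in the same units.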

Also provided are methods for screening for compounds that modulate an
activity of CVSP16. The compounds are identified by contacting them with the
CVSP16 or protease domain thereof and a substrate for the CVSP16. A change
in the amount of substrate cleaved in the presence of the compound
compared to that in the absence of the compound indicates that the compound
modulates an activity of a CVSP16. Such compounds are selected for further
analyses or for use to modulate the activity of the CVSP16, such as inhibitors or
agonists. The compounds also can be identified by contacting the substrates
with a cell that expresses the CVSP16 or with the CVSP16 or a proteolytically
active portion thereof.
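
The readout of such a screen, the amount of substrate cleaved with versus without a test compound, can be sketched numerically. This is a minimal illustration only: the function names, the arbitrary signal units and the 20% threshold are assumptions made for the example and are not taken from this application.

```python
def percent_change_in_cleavage(cleaved_with_compound, cleaved_without_compound):
    """Percent change in substrate cleaved relative to the no-compound control.

    Negative values suggest inhibition; positive values suggest enhancement.
    """
    if cleaved_without_compound <= 0:
        raise ValueError("control cleavage must be positive")
    difference = cleaved_with_compound - cleaved_without_compound
    return 100.0 * difference / cleaved_without_compound


def classify_compound(cleaved_with_compound, cleaved_without_compound,
                      threshold_pct=20.0):
    """Label a test compound by its effect on cleavage.

    threshold_pct is a hypothetical cutoff chosen only for illustration.
    """
    change = percent_change_in_cleavage(cleaved_with_compound,
                                        cleaved_without_compound)
    if change <= -threshold_pct:
        return "candidate inhibitor"
    if change >= threshold_pct:
        return "candidate agonist"
    return "no clear modulation"
```

In practice the two inputs would come from a detectably labeled substrate measured under otherwise identical conditions, and compounds flagged in this way would go on to further analysis.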

Computer based screening methods are also provided. In these methods,
interactions between simulated test compounds and computer-simulated CVSP16
polypeptides are assessed, such as by computational docking or binding studies.
Test compounds predicted to bind or otherwise interact with a CVSP16
polypeptide are selected as drug candidates. Further characterization and study,
such as in vitro assays, can be performed.

Also provided herein are modulators of the activity of CVSP16, especially
the modulators (i.e., antagonists, agonists, inhibitors), including antibodies and
RNA molecules, and molecules obtained according to the screening methods
provided herein. Such modulators can have use in treating cancerous conditions,
and other neoplastic conditions.

Pharmaceutical compositions containing a protease domain and/or
full-length or two protease domains or other domain of a CVSP16 polypeptide
in a pharmaceutically acceptable carrier or excipient are provided
herein.

Also provided are articles of manufacture that contain CVSP16
polypeptide and/or a protease domain or protease domains of a CVSP16 as
single-chain zymogen and activated forms and as full-length form zymogen or
activated forms and other forms thereof. The articles contain a) packaging
material; b) a CVSP16 polypeptide (or encoding nucleic acid), particularly a
polypeptide containing a single chain protease domain thereof; and c) a label
indicating that the article is for use in assays for identifying modulators of the
activities of a CVSP16 polypeptide or for use in diagnostic assays.

Conjugates containing a) a CVSP16 polypeptide or protease domain in
single chain form; and b) a targeting agent linked to the SP directly or via a
linker, wherein the agent facilitates: i) affinity isolation or purification of the
conjugate; ii) attachment of the conjugate to a surface; iii) detection of the
conjugate; or iv) targeted delivery to a selected tissue or cell, are provided herein.
The conjugate can contain a plurality of agents linked thereto. The conjugate
can be a chemical conjugate; and it can be a fusion protein. Targeting agents
include proteins and peptide fragments. The protein or peptide fragment can
include a protein binding sequence, a nucleic acid binding sequence, a lipid
binding sequence, a polysaccharide binding sequence, or a metal binding
sequence.

Combinations are provided herein. The combination can include: a) an
inhibitor of an activity of a CVSP16; and b) an anti-cancer treatment or agent.
The SP inhibitor and the anti-cancer agent can be formulated in a single
pharmaceutical composition or each is formulated in a separate pharmaceutical
composition. The CVSP16 inhibitor can be an antibody or a fragment or binding
portion thereof made against the CVSP16, such as an antibody that specifically
binds to a protease domain, an inhibitor of CVSP16 production, or an inhibitor of
CVSP16 membrane-localization or an inhibitor of CVSP16 activation. Other
CVSP16 inhibitors include, but are not limited to, an antisense nucleic acid
encoding the CVSP16, particularly a portion of a protease domain, and a nucleic
acid encoding at least a portion of a gene encoding the CVSP16 with a
heterologous nucleotide sequence inserted therein such that the heterologous
sequence inactivates the biological activity of the encoded CVSP16 or the gene
encoding it. The portion of the gene encoding the CVSP16 typically will flank the
heterologous sequence to promote homologous recombination with a genomic
gene encoding the CVSP16.

Also provided are methods for treating or preventing a tumor or cancer in
a mammal by administering to a mammal an effective amount of an inhibitor of a
CVSP16, whereby the tumor or cancer is treated or prevented. The CVSP16
inhibitor used in the treatment or for prophylaxis is administered with a
pharmaceutically acceptable carrier or excipient. The mammal treated can be a
human. The treatment or prevention method can additionally include
administering an anti-cancer treatment or agent simultaneously with or
subsequently or before administration of the CVSP16 inhibitor.

Also provided is a recombinant non-human animal in which an
endogenous gene of a CVSP16 has been deleted or inactivated by homologous
recombination or insertional mutagenesis of the animal or an ancestor thereof. A
recombinant non-human animal is provided herein, where the gene of a CVSP16
is under control of a promoter that is not the native promoter of the gene or that
is not the native promoter of the gene in the non-human animal or where the
nucleic acid encoding the CVSP16 is heterologous to the non-human animal and
the promoter is the native or a non-native promoter or the CVSP16 is on an
extrachromosomal element, such as a plasmid or artificial chromosome.

Also provided are methods of treating tumors by administering a
prodrug that is activated by CVSP16 that is expressed or active in tumor cells,
particularly those in which its functional activity in tumor cells is greater than in
non-tumor cells. The prodrug is administered and, upon administration, active
CVSP16 expressed on cells cleaves the prodrug and releases active drug in the
vicinity of these cells. The active anti-cancer drug accumulates in the vicinity of
the tumor. This is particularly useful in instances in which CVSP16 is expressed
or active in greater quantity, at a higher level or predominantly in tumor cells
compared to other cells.

Also provided are methods of identifying a compound that binds to a form
of CVSP16, by contacting a test compound with two or more forms; determining
to which form or forms the compound binds; and if it binds to a form of
CVSP16, further determining whether the compound has at least one of the
following properties:

(i) inhibits activation of a single-chain zymogen form of CVSP16;

(ii) inhibits activity of a form; and/or

(iii) inhibits dimerization of the protein.

The forms can be full-length zymogen or activated forms, or a single- or
multi-chain protease domain, or two protease domains, or another form resulting
from cleavage at an activation site.

Also provided are methods of diagnosing the presence of a pre-malignant
lesion, a malignancy, or other pathologic condition in a subject, by obtaining a
biological sample from the subject; exposing it to a detectable agent that binds
to a multi-chain (i.e., two- or three-chain) or single-chain form of CVSP16,
where the pathological condition is characterized by the presence or absence or
amount of a three-chain, two-chain or single-chain form.

Methods of inhibiting tumor invasion or metastasis or treating a malignant
or pre-malignant condition by administering an agent that inhibits activation of a
zymogen form of CVSP16 or an activity of an activated form are provided. The
conditions include, but are not limited to, tumors of the uterus and
stomach, and also of the breast, cervix, prostate, esophagus, lung, ovary
and colon.

Methods for monitoring tumor progression and/or therapeutic
effectiveness are also provided. The levels of activation or expression of
CVSP16 or a protease domain or domains thereof are assessed, and the change
in the level reflects tumor progression and/or the effectiveness of therapy.
Generally, as the tumor progresses the amount of CVSP16 in a body tissue or
fluid sample increases; effective therapy reduces the level.
DETAILED DESCRIPTION 

A. DEFINITIONS

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of skill in the art to
which the invention(s) belong. All patents, patent applications, published
applications and publications, Genbank sequences, databases, websites and
other published materials referred to throughout the entire disclosure herein,
unless noted otherwise, are incorporated by reference in their entirety. In the
event that there are a plurality of definitions for terms herein, those in this
section prevail. Where reference is made to a URL or other such identifier or
address, it is understood that such identifiers can change and particular information
on the internet can come and go, but equivalent information can be found by
searching the internet. Reference thereto evidences the availability and public
dissemination of such information.

As used herein, the abbreviations for any protective groups, amino acids
and other compounds are, unless indicated otherwise, in accord with their
common usage, recognized abbreviations, or the IUPAC-IUB Commission on
Biochemical Nomenclature (see, (1972) Biochem. 11:942-944).

As used herein, protease refers to an enzyme catalyzing hydrolysis of
proteins or peptides. It includes zymogen forms and activated single-, two- and
three-chain forms thereof. For clarity, reference to protease refers to all forms,
and particular forms will be specifically designated.

As used herein, serine protease refers to a diverse family of proteases
wherein a serine residue is involved in the hydrolysis of proteins or peptides.
The serine residue can be part of the catalytic triad mechanism, which includes a
serine, a histidine and an aspartic acid in the catalysis, or be part of the
hydroxyl/ε-amine or hydroxyl/α-amine catalytic dyad mechanism, which involves
a serine and a lysine in the catalysis. Of particular interest are SPs of
mammalian, including human, origin. Those of skill in this art recognize that, in
general, single amino acid substitutions in non-essential regions of a polypeptide
do not substantially alter biological activity (see, e.g., Watson et al. (1987)
Molecular Biology of the Gene, 4th Edition, The Benjamin/Cummings Pub. Co.,
p. 224).

As used herein, "transmembrane serine protease (MTSP)" refers to a
family of transmembrane serine proteases that share common structural features
as described herein (see, also Hooper et al. (2001) J. Biol. Chem. 276:857-860).
Thus, reference, for example, to "MTSP" encompasses all proteins encoded by
the MTSP gene family, including but not limited to: MTSP3, MTSP4,
MTSP6, MTSP7, MTSP9, MTSP10, MTSP20 or an equivalent molecule obtained
from any other source or that has been prepared synthetically or that exhibits
the same activity. Other MTSPs include, but are not limited to, corin,
enteropeptidase, human airway trypsin-like protease (HAT), MTSP1, TMPRSS2
and TMPRSS4. Sequences of encoding nucleic acid molecules and the encoded
amino acid sequences of exemplary MTSPs and/or domains thereof are set forth,
for example, in U.S. application Serial No. 09/776,191 (SEQ ID Nos. 1-12, 49,
50 and 61-72 therein, published as International PCT application No. WO
01/57194; see also published International PCT application Nos. WO 02/072786
and WO 02/977267, and International PCT application Nos. PCT/US02/21208
and PCT/US02/15332). The term also encompasses MTSPs with amino acid
substitutions that do not substantially alter activity of each member and also
encompasses splice variants thereof. Suitable substitutions, including, although
not necessarily, conservative substitutions of amino acids, are known to those of
skill in this art and can be made without eliminating a biological activity, such as
the catalytic activity, of the resulting molecule.

As used herein, Type I MTSP refers to transmembrane proteins made with
an N-terminal signal peptide that is cleaved so that the new N-terminus is on the
extracytoplasmic side of the membrane. The original N-terminus likely stays on
the cytoplasmic side, and cleavage occurs on the other side of the membrane.
These proteins are anchored through a C-terminal membrane-spanning segment.

As used herein, Type II MTSP refers to transmembrane proteins that are
synthesized with N-terminal or internal signal peptides that are not cleaved and
that serve as a membrane anchor.

As used herein, a "protease domain of a CVSP", particularly CVSP16,
refers to a domain of a SP that exhibits proteolytic activity and shares homology
and structural features with the chymotrypsin/trypsin family protease domains.
Hence it is at least the minimal portion of the domain that exhibits proteolytic
activity (or other functional activity, such as ligand or substrate binding) as
assessed by standard in vitro assays. Those of skill in this art recognize that a
protease domain is the portion of the protease that is structurally equivalent to
the trypsin or chymotrypsin fold. Contemplated herein are polypeptides that
include such protease domains and catalytically active portions thereof. Also
provided are truncated forms of a protease domain that include the smallest
fragment thereof that acts catalytically as a single-chain form.

Thus, as used herein a protease domain of a CVSP16, whenever
referenced herein, includes at least one or all of or any combination of or a
catalytically active portion of a CVSP16 polypeptide as defined herein. A
protease domain of a CVSP16 protein refers to a protease domain of a CVSP16
that exhibits serine proteolytic activity. Hence it is at least a minimal portion of
the protein that exhibits proteolytic activity (or binds to a substrate or ligand) as
a single-chain form or as an activated form, as assessed by standard assays in
vitro. It refers, herein, to a single-chain form and also multi-chain activated
forms. Exemplary protease domains include at least a sufficient portion of a
sequence of amino acids set forth in SEQ ID No. 6 (encoded by nucleotides in
SEQ ID No. 5) to exhibit protease activity.

Protease domains of CVSPs vary in size and constitution, including
insertions and deletions in surface loops. They retain conserved structure,
including at least one of the active site triad, primary specificity pocket,
oxyanion hole and/or other features of serine protease domains of proteases.
For purposes herein, a protease domain is a single-chain portion of a CVSP16, as
defined herein, but is homologous in its structural features and retention of
sequence similarity or homology to the protease domain of chymotrypsin or
trypsin. Smaller portions of the protease domains, particularly the single-chain
domains thereof, that retain protease activity are contemplated. Such smaller
versions will generally be C-terminal truncated versions of the protease domains.
Multi-chain, typically two-chain, forms resulting from activation cleavage are also
contemplated. The polypeptide can exhibit proteolytic activity as a single chain.
Thus, for purposes herein, a protease domain is a portion of a CVSP16,
as defined herein, and shares common features and homologies with protease
domains of other SPs. As with the larger class of enzymes of the chymotrypsin
(S1) fold (see, e.g., Internet accessible MEROPS database), the CVSP16
protease domains share a high degree of amino acid sequence identity. The His,
Asp and Ser residues necessary for activity are present in conserved motifs in
PD1 and in modified form in PD2. An activation site, whose cleavage creates
the N-terminus of the protease domain in two-chain forms, has a conserved motif
and location relative to other sites in the domain and can be identified. Domains
that include some of these conserved features, such as the His, but that exhibit
any proteolytic activity are considered protease domains herein. An activation
site, which results in the N-terminus of a second chain in the multi-chain forms,
has a conserved motif and/or location within the polypeptide.
Exemplary of a protease domain of a CVSP16 is a polypeptide set forth
as amino acids 46 to 286 or including amino acids 326 to 550, particularly 323-
550, of SEQ ID No. 6, which can be provided as a single-chain isolated molecule or
as a molecule that contains both domains but not the full-length CVSP16. The
single-chain form can be a zymogen or activated form in which the N-terminus
corresponds to the N-terminus produced by activation cleavage.
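
By way of illustration only, the residue ranges recited for the protease domains can be applied to a sequence string as sketched below. The sketch assumes 1-based, inclusive residue numbering and uses the PD1 (46 to 286) and PD2 (323 to 550) ranges recited herein; because SEQ ID No. 6 is not reproduced in this passage, a placeholder sequence is used, and the names `DOMAIN_RANGES`, `extract_domain` and `placeholder` are hypothetical.

```python
from typing import Dict, Tuple

# 1-based, inclusive residue ranges recited in the text for SEQ ID No. 6.
DOMAIN_RANGES: Dict[str, Tuple[int, int]] = {
    "PD1": (46, 286),
    "PD2": (323, 550),
}


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if start < 1 or end < start or end > len(sequence):
        raise ValueError(
            f"range {start}-{end} is outside a sequence of length {len(sequence)}"
        )
    # Convert 1-based inclusive coordinates to a Python slice.
    return sequence[start - 1:end]


# Placeholder only: a dummy poly-alanine string stands in for SEQ ID No. 6,
# which is not reproduced here. Its length (700) is an arbitrary assumption.
placeholder = "A" * 700

domains = {
    name: extract_domain(placeholder, start, end)
    for name, (start, end) in DOMAIN_RANGES.items()
}

for name, seq in domains.items():
    print(name, len(seq))
```

With these ranges the extracted PD1 spans 241 residues and PD2 spans 228 residues, whatever the underlying sequence.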

As used herein, the catalytically active domain of CVSP16 refers to the
protease domain. Reference to a protease domain of a CVSP16 includes the
single-, two- and multi-chain forms of any of these proteins. A zymogen form of
each protein is an inactive single-chain form, which can be converted to an
active multi-chain form by cleavage. A protease domain also can be converted
to a multi-chain form.

Significantly, at least in vitro, the single-chain forms of the SPs and
catalytic domains or proteolytically active portions thereof (typically C-terminal
truncations) exhibit protease activity. Hence provided herein are isolated
single-chain forms of protease domains of SPs and their use in in vitro drug
screening assays for identification of agents that modulate an activity thereof.

For the protease domains, residues at the N-terminus can be critical for
activity, since it has been shown that the free amino group at the N-terminus of
such proteases is essential for formation of the catalytically active conformation
upon activation cleavage of the zymogen form of the protease. A protease
domain of the single-chain form of the CVSP16 protease can be catalytically
active. Hence a protease domain generally requires N-terminal amino acids; the
C-terminal portion can be truncated. The amount that can be removed can be
determined empirically by testing the protein for protease activity in an in vitro
assay that assesses catalytic cleavage.

Thus, as used herein, the catalytically active domain of a CVSP refers to
a protease domain. Typically this refers to a single-chain form of a protease
domain. Multi-chain, particularly two-chain, forms of a protease domain also are
contemplated; such domains are necessarily longer since they include one or
more amino acids on the N-terminus side of the activation cleavage site as a
second chain. A zymogen form of each protease domain and the protease is a
single chain, which is converted to an activated, typically a multi-chain form,
such as a two- or a three-chain form or an active two- or three-chain form, by
activation cleavage of one or both protease domains. Single-chain forms with
the N-terminus that corresponds to the activation cleavage site can be active in
vitro and are provided.
As used herein, by active form is meant a form active in vivo and/or in
vitro. Single-chain forms of the SPs and the catalytic domains or proteolytically
active portions thereof (typically C-terminal truncations)
exhibit protease activity. For example, a polypeptide containing the protease
domain can exist as an activated three-chain, two-chain or a single-chain active
form. The active single-chain, two-chain and three-chain forms of a CVSP16
and catalytic domains or proteolytically active portions thereof can exhibit
protease activity. Among the polypeptides provided herein are isolated
single-chain, two-chain and three-chain forms of CVSP16 polypeptides that
include protease domains and their use, for example, in in vitro drug screening
assays for identification of agents that modulate the activity thereof.

There are two types of single-chain forms: active and zymogen forms.
When cleaved at the activation site, there is only one single-chain active form.
Any form that results from activation cleavage or that has an N-terminus at a
site that is the site of cleavage is an active single-chain form. The N-terminus
of an activated form must result from activation cleavage and be at that site.
Any other single-chain form, which has a different N-terminus, is a zymogen.
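
The distinction drawn above between the two types of single-chain form reduces to a comparison of N-terminal positions, which can be sketched as follows. This is a minimal illustration, not an assay: the function name `classify_single_chain` is hypothetical, positions are assumed to be 1-based residue numbers in the full-length numbering, and the example value 46 (the first residue recited for PD1) is used only as an assumed stand-in for the N-terminus created by activation cleavage.

```python
def classify_single_chain(n_terminus: int, activation_n_terminus: int) -> str:
    """Classify a single-chain form by the position of its N-terminus.

    n_terminus: first residue of the single-chain polypeptide.
    activation_n_terminus: first residue that activation cleavage would
        expose (the residue immediately C-terminal to the cleaved bond).
    """
    if n_terminus < 1 or activation_n_terminus < 1:
        raise ValueError("residue positions are 1-based and must be positive")
    # Only a form whose N-terminus sits exactly at the activation
    # cleavage site counts as an active single-chain form.
    if n_terminus == activation_n_terminus:
        return "active single-chain form"
    # Any other N-terminus makes the single-chain form a zymogen.
    return "zymogen"


# Illustrative values only; 46 is an assumed activation-site N-terminus.
print(classify_single_chain(46, 46))
print(classify_single_chain(1, 46))
```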

As used herein, a CVSP16, whenever referenced herein, includes at least
one or all of or any combination of:

a) a polypeptide encoded by the sequence of nucleotides set forth
in SEQ ID No. 5;

b) a polypeptide encoded by a sequence of nucleotides that
hybridizes under conditions of low, moderate or high stringency to the sequence
of nucleotides set forth in SEQ ID No. 5, wherein the encoded polypeptide does
not include at least 5 or at least 6 or at least 7 or at least 8 contiguous or at
least 9 or at least 10 contiguous amino acids from SEQ ID No. 21 up to all of
SEQ ID No. 21, and particularly does not include any amino acids therefrom
between amino acids corresponding to Gln 660 and Met 661 of SEQ ID No. 6;

c) a polypeptide that contains the sequence of amino acids set
forth in SEQ ID No. 6;

d) a polypeptide that contains at least two protease domains of a
serine protease 16 (CVSP16) and includes at least 5 contiguous amino acids
corresponding to residues 508-544 of SEQ ID No. 6, or contains the contiguous
sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp Asn Asp Ser or Cys Trp
Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the second protease domain;

e) a polypeptide that contains a sequence of amino acids having at
least about 60%, 70%, 80%, 90% or about 95% sequence identity with the
sequence of amino acids set forth in SEQ ID No. 6, wherein the polypeptide does
not include at least 5 or at least 6 or at least 7 or at least 8 contiguous or at
least 9 or at least 10 contiguous amino acids from SEQ ID No. 21 up to all of
SEQ ID No. 21, and particularly does not include any amino acids therefrom
between amino acids corresponding to Gln 660 and Met 661 of SEQ ID No. 6;

f) a polypeptide encoded by a sequence of nucleotides
that hybridizes under conditions of at least moderate, and can be high,
stringency along at least 70% of its full length to a sequence of nucleotides that
encodes a polypeptide of any of a)-e), wherein the polypeptide does not include
at least 5 or at least 6 or at least 7 or at least 8 contiguous or at least 9 or at
least 10 contiguous amino acids from SEQ ID No. 21 up to all of SEQ ID No.
21, and particularly does not include any amino acids therefrom between amino
acids corresponding to Gln 660 and Met 661 of SEQ ID No. 6;

g) a polypeptide that has at least 60%, 70%, 80%, 90% or
about 95% sequence identity with a polypeptide of any of a)-f), wherein the
polypeptide does not include at least 5 or at least 6 or at least 7 or at least 8
contiguous or at least 9 or at least 10 contiguous amino acids from SEQ ID No.
21 up to all of SEQ ID No. 21, and particularly does not include any amino acids
therefrom between amino acids corresponding to Gln 660 and Met 661 of SEQ ID
No. 6; and/or

h) a polypeptide encoded by a splice variant of a sequence of
nucleotides that encodes a CVSP16 polypeptide provided herein. Smaller
portions thereof that retain protease activity are contemplated.
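
The recurring proviso in items b) and e)-g), that a polypeptide does not include at least 5 to 10 contiguous amino acids from a reference sequence, can be checked mechanically. The following is a minimal sketch under stated assumptions: the function name `shares_contiguous_run` is hypothetical, the test is an exact-match comparison of fixed-length windows, and the two short strings are toy sequences, since SEQ ID No. 21 is not reproduced in this passage.

```python
def shares_contiguous_run(candidate: str, reference: str, min_len: int) -> bool:
    """Return True if candidate contains at least min_len contiguous
    residues that also occur contiguously in reference."""
    if min_len < 1:
        raise ValueError("min_len must be positive")
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # Every window of length min_len in the reference sequence.
    reference_windows = {
        reference[i:i + min_len] for i in range(len(reference) - min_len + 1)
    }
    # A shared run of min_len or more residues always contains a shared
    # window of exactly min_len residues, so checking windows suffices.
    return any(
        candidate[i:i + min_len] in reference_windows
        for i in range(len(candidate) - min_len + 1)
    )


# Toy sequences for illustration only; they share the run "TLLVA".
reference = "MKTLLVAGG"
candidate = "QQTLLVAQQ"

print(shares_contiguous_run(candidate, reference, 5))
print(shares_contiguous_run(candidate, reference, 6))
```

A candidate would satisfy the proviso at a given threshold when the function returns False for that `min_len`.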

An exemplary CVSP16 polypeptide includes the sequence of amino acids 
set forth in SEQ ID No. 6; the protease domains are included in the sequence of 
amino acids set forth in SEQ ID No. 6. Smaller portions thereof that retain 
protease activity are contemplated. 
Included among the CVSP16 polypeptides, as provided herein, are those
that include a protease domain of serine protease 16 (CVSP16) or a catalytically
active portion thereof, where:

the polypeptide does not include at least 5 or at least 6 or at least 7 or at
least 8 contiguous or at least 9 or at least 10 contiguous amino acids from SEQ
ID No. 21, and particularly does not include any amino acids therefrom between
amino acids corresponding to Gln 660 and Met 661 of SEQ ID No. 6;
the polypeptide contains one, two or three chains; and
a protease domain contains amino acids 46-286 or 326-550 of SEQ ID
No. 6 or amino acids that share at least about 60%, 70%, 80%, 90% or 95%
homology to amino acids 46-286 or 326-550 of SEQ ID No. 6. Exemplary of
the protease domains are PD1 (aa 46 to aa 286 of SEQ ID No. 6); and PD2 (aa
323 to aa 550 of SEQ ID No. 6). Any polypeptide provided herein that includes
PD1 and PD2 of a CVSP16 does not include at least 5, 6, 10, 15 or 20
contiguous amino acids from SEQ ID No. 21, and particularly does not include any
amino acids therefrom between amino acids (if present) corresponding to Gln 660
and Met 661 of SEQ ID No. 6.

The CVSP16 can be from any animal, particularly a mammal, and includes,
but is not limited to, primates including humans, gorillas and monkeys; rodents,
such as mice and rats; fowl, such as chickens; ruminants, such as goats, cows,
deer, sheep; porcine, such as pigs; and other animals. The full-length zymogen or
a multi-chain activated form is contemplated as is any domain thereof, including
a protease domain, which can be a three-chain activated form, two-chain
activated form, or a single-chain activated or zymogen form. An exemplary
CVSP16 protein includes a sequence of amino acids set forth in SEQ ID No. 6
that includes a protease domain or a catalytically active portion thereof.

CVSP16s of interest include those that are activated and/or expressed in
tumor cells at a level or amount different, typically higher, from those in
non-tumor cells; and those from cells in which substrates therefor differ in tumor
cells from non-tumor cells or differ with respect to the substrates, co-factors or
receptors, or otherwise alter the activity or specificity of the MTSP.

By active form is meant a form that is functionally active in vivo and/or in
vitro. Included among these are single-chain forms of a protease domain (i.e.,
those with an N-terminus corresponding to the normal activation cleavage site)
that are active in vitro, including in assays for protease activity and/or substrate
or ligand binding, and are used for screening assays, and also can be used as
immunogens for preparing antibodies.

As used herein, a CVSP16 and/or a protease domain(s) thereof also can
exist as a two-chain or other multi-chain (i.e., two or three chains) activated
form. At least in vitro, unactivated and active single-chain forms of the SPs and
other unactivated forms and the catalytic domains or proteolytically or
functionally active portions thereof (typically C-terminal truncations) exhibit
protease or other functional activity (i.e., catalytic activity and/or other activity,
such as substrate or ligand binding). Hence provided herein are isolated
single-chain forms of protease domains of SPs, and also multi-chain forms, and their
use in in vitro drug screening assays for identification of agents that modulate
the activity thereof.

As used herein, a functionally active CVSP16 polypeptide or portion
thereof exhibits at least one of catalytic activity, substrate binding activity or
ligand binding activity. As described, for example, PD1 and PD2 exhibit catalytic
(protease) activity. The CVSP16 polypeptide can be active in vitro or in vivo.

Also contemplated are nucleic acid molecules that encode a polypeptide
that has proteolytic activity in an in vitro proteolysis assay and that have at least
80%, 85%, 90% or 95% sequence identity with the full length of a protease
domain of a CVSP16 polypeptide as provided herein, or that hybridize along at
least about 70%, 80%, 90% or 95% of their full length to a nucleic acid that
encodes a protease domain, particularly under conditions of moderate, generally
high, stringency.

As used herein, a zymogen form of a full-length CVSP16 polypeptide or a
truncated CVSP16 polypeptide containing one or a plurality of protease domains
is one in which at least one of the domains has not undergone activation
cleavage. A zymogen form refers to a form in which at least one protease
domain remains unactivated.

As used herein, activation cleavage refers to the cleavage of the protease
at the N-terminus of the protease domain (generally between an R and I or an R
and a V or elsewhere as for PD2 herein). By virtue of the Cys-Cys pairing
between a Cys outside the protease domain and a Cys in the protease domain,
upon cleavage the resulting polypeptide has at least two chains. Cleavage can
be effected by another protease or autocatalytically. In the exemplified CVSP16,
the following cysteine pairings are noted: C 7Z -Cbs*, C173-C249, C206-C228, C239-C267,
C348-C364, C^-Cwe, C472-C494 and C506-C534. In addition, an unpaired cysteine
(C159) in the first protease domain should pair with C38 (by unpaired is meant
unpaired with a Cys in the particular domain). An unpaired cysteine (C430) in the
second protease domain should pair with C325 or C318. As a result the protease,
upon activation cleavage, can contain multiple chains, including two or three or
more chains, formed by virtue of pairing between the unpaired cysteine in a
protease domain with a cysteine outside a protease domain.

As used herein, a two-chain form of a protease domain refers to a
two-chain form that is formed from a single-chain form following activation cleavage
or other cleavage of the protease. In such forms the Cys pairing between, in
this instance, a Cys outside a protease domain and an unpaired Cys in PD1 or
PD2, links a protease domain to the remainder of the polypeptide and the
activation cleavage cleaves the chain. A two-chain protease domain form refers
to any form in which the "remainder of the polypeptide" is shortened or cleaved
from the full-length and includes a Cys from outside a protease domain. The
three-chain form refers to a form in which two of the protease domains are
activated in this manner.

As used herein, a form of a CVSP16 is one or more of a single-chain
form, a two-chain form, a three-chain form and/or other multi-chain form and the
form is activated or is a zymogen or includes one or more activated domains.
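
The relationship described above between activation cleavage and the number of chains can be summarized in a short sketch. It rests on simplifying assumptions that are not stated as requirements in the text: each activated protease domain contributes exactly one cleavage, no other cleavage occurs, and the chains remain linked by the Cys pairings. The names `CHAIN_NAMES` and `describe_form` are hypothetical, and truncated active single-chain forms are not modeled.

```python
from typing import Dict, Union

CHAIN_NAMES = {1: "single-chain", 2: "two-chain", 3: "three-chain"}


def describe_form(
    activated_domains: int, total_domains: int = 2
) -> Dict[str, Union[int, str, bool]]:
    """Describe a polypeptide form from the number of activated domains.

    Assumes one activation cleavage per activated protease domain and
    no other cleavage, so each activation adds one chain.
    """
    if not 0 <= activated_domains <= total_domains:
        raise ValueError("activated_domains must be between 0 and total_domains")
    chains = 1 + activated_domains
    return {
        "chains": chains,
        "name": CHAIN_NAMES.get(chains, "multi-chain"),
        # A zymogen form is one in which at least one protease domain
        # has not undergone activation cleavage.
        "is_zymogen": activated_domains < total_domains,
    }


for n in range(3):
    print(n, describe_form(n))
```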

As used herein, a human protein is one encoded by nucleic acid, such as
DNA, present in the genome of a human, including all allelic variants and
conservative variations as long as they are not variants found in other mammals.

As used herein, a "nucleic acid encoding a protease domain or
catalytically active portion of a SP" refers to a nucleic acid encoding only the
recited single-chain protease domain or active portion thereof, and not the other
contiguous portions of the SP as a continuous sequence.

As used herein, catalytic activity refers to the activity of the SP as a
serine protease. Function of the SP refers to its function in tumor biology,
including promotion of or involvement in initiation, growth or progression of
tumors, and also roles in signal transduction. Catalytic activity refers to the
activity of the SP as a protease as assessed in in vitro proteolytic assays that
detect proteolysis of a selected substrate.

As used herein, a CUB domain is a motif homologous to domains that
mediate protein-protein interactions in complement components C1r/C1s. CUB
domains have been identified in other proteases and various proteins involved in
developmental processes.

As used herein, a "propeptide" or "pro sequence" is a sequence of amino
acids positioned at the amino terminus of a mature biologically active
polypeptide. When so-positioned, the resulting polypeptide is called a zymogen.
Zymogens, generally, are inactive and can be converted to mature active
polypeptides by catalytic or autocatalytic cleavage of the propeptide from the
zymogen.

As used herein, a zymogen is an inactive precursor of a proteolytic
enzyme. Such precursors are generally larger, although not necessarily larger,
than the active form. With reference to serine proteases, zymogens are
converted to active enzymes by specific cleavage, including catalytic and
autocatalytic cleavage, or by binding of an activating co-factor, which generates
an active enzyme. A zymogen, thus, is an enzymatically inactive protein that is
converted to a proteolytic enzyme by the action of an activator. Cleavage can
be effected autocatalytically.

As used herein, "disease or disorder" refers to a pathological condition in
an organism resulting from, e.g., infection or genetic defect, and characterized
by identifiable symptoms.

As used herein, neoplasm (neoplasia) refers to abnormal new growth, and 
thus means the same as tumor, which can be benign or malignant. Unlike 
hyperplasia, neoplastic proliferation persists even in the absence of the original 
stimulus. 

As used herein, neoplastic disease refers to any disorder involving cancer,
including tumor development, growth, metastasis and progression.

As used herein, cancer is a general term for diseases caused by or 
characterized by any type of malignant tumor. 

As used herein, malignant, as applies to tumors, refers to primary tumors
that have the capacity of metastasis with loss of growth control and positional
control.

As used herein, an anti-cancer agent (used interchangeably with "anti-tumor
or antineoplastic agent") refers to any agents used in anti-cancer
treatment. These include any agents, when used alone or in combination with
other compounds, that can alleviate, reduce, ameliorate, prevent, or place or
maintain in a state of remission clinical symptoms or diagnostic markers
associated with neoplastic disease, tumors and cancer, and can be used in
methods, combinations and compositions provided herein. Non-limiting
examples of antineoplastic agents include anti-angiogenic agents, alkylating
agents, antimetabolites, certain natural products, platinum coordination
complexes, anthracenediones, substituted ureas, methylhydrazine derivatives,
adrenocortical suppressants, certain hormones, antagonists and anti-cancer
polysaccharides.

As used herein, a splice variant refers to a variant produced by differential
processing of a primary transcript of genomic nucleic acid, such as DNA, that
results in more than one type of mRNA. Polypeptides encoded by splice variants
of the disclosed nucleic acids encoding CVSP16 polypeptides are provided
herein.

As used herein, angiogenesis is intended to broadly encompass the
totality of processes directly or indirectly involved in the establishment and
maintenance of new vasculature (neovascularization), including, but not limited
to, neovascularization associated with tumors.

As used herein, anti-angiogenic treatment or agent refers to any
therapeutic regimen and compound, when used alone or in combination with
other treatment or compounds, that can alleviate, reduce, ameliorate, prevent, or
place or maintain in a state of remission clinical symptoms or diagnostic
markers associated with undesired and/or uncontrolled angiogenesis. Thus, for
purposes herein an anti-angiogenic agent refers to an agent that inhibits the
establishment or maintenance of vasculature. Such agents include, but are not
limited to, anti-tumor agents, and agents for treatments of other disorders
associated with undesirable angiogenesis, such as diabetic retinopathies,
restenosis, hyperproliferative disorders and others.

As used herein, non-anti-angiogenic anti-tumor agents refer to anti-tumor
agents that do not act primarily by inhibiting angiogenesis.

As used herein, pro-angiogenic agents are agents that promote the
establishment or maintenance of the vasculature. Such agents include agents
for treating cardiovascular disorders, including heart attacks and strokes.

As used herein, undesired and/or uncontrolled angiogenesis refers to
pathological angiogenesis wherein the influence of angiogenesis stimulators
outweighs the influence of angiogenesis inhibitors. As used herein, deficient
angiogenesis refers to pathological angiogenesis associated with disorders where
there is a defect in normal angiogenesis resulting in aberrant angiogenesis or an
absence or substantial reduction in angiogenesis.

As used herein, homologous means greater than about 25% nucleic
acid sequence identity, such as 25%, 40%, 60%, 70%, 80%, 90% or 95%. If
necessary the percentage homology will be specified. The terms "homology"
and "identity" are often used interchangeably but homology for proteins can 
include conservative amino acid changes. In general, sequences (protein or
nucleic acid) are aligned so that the highest order match is obtained (see, e.g.:
Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New
York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I,
Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994;
Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987;
and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton
Press, New York, 1991; Carillo et al. (1988) SIAM J Applied Math 48:1073).
For sequence identity, the number of identical amino acids is determined by
standard alignment algorithm programs, used with default gap penalties
established by each supplier. Substantially homologous nucleic acid molecules 
would hybridize typically at moderate stringency or at high stringency all along 
the length of the nucleic acid or along at least about 70%, 80% or 90% of the 
full length nucleic acid molecule of interest. Also contemplated are nucleic acid
molecules that contain degenerate codons in place of codons in the hybridizing
nucleic acid molecule. (For proteins, for determination of homology, conservative
amino acids can be aligned as well as identical amino acids; in this case
percentage of identity and percentage homology vary.)

Whether any two nucleic acid molecules have nucleotide sequences that
are at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% "identical" can be
determined using known computer algorithms such as the "FASTA" program,
using for example, the default parameters as in Pearson et al. (1988) Proc. Natl.
Acad. Sci. USA 85:2444 (other programs include the GCG program package
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)), BLASTP,
BLASTN, FASTA (Altschul, S.F., et al., J Molec Biol 215:403 (1990); Guide to
Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego, 1994, and
Carillo et al. (1988) SIAM J Applied Math 48:1073). For example, the BLAST
function of the National Center for Biotechnology Information database can be
used to determine identity. Other commercially or publicly available programs
include the DNAStar "MegAlign" program (Madison, WI) and the University of
Wisconsin Genetics Computer Group (UWG) "Gap" program (Madison, WI)).
Percent homology or identity of proteins and/or nucleic acid molecules can be
determined, for example, by comparing sequence information using a GAP
computer program (e.g., Needleman et al. (1970) J. Mol. Biol. 48:443, as
revised by Smith and Waterman (1981) Adv. Appl. Math. 2:482). Briefly, a
GAP program defines similarity as the number of aligned symbols (i.e.,
nucleotides or amino acids) which are similar, divided by the total number of
symbols in the shorter of the two sequences. Default parameters for the GAP
program can include: (1) a unary comparison matrix (containing a value of 1 for
identities and 0 for non-identities) and the weighted comparison matrix of
Gribskov et al. (1986) Nucl. Acids Res. 14:6745, as described by Schwartz and
Dayhoff, eds., ATLAS OF PROTEIN SEQUENCE AND STRUCTURE, National
Biomedical Research Foundation, pp. 353-358 (1979); (2) a penalty of 3.0 for each
gap and an additional 0.10 penalty for each symbol in each gap; and (3) no
penalty for end gaps. Therefore, as used herein, the term "identity" represents a
comparison between a test and a reference polypeptide or polynucleotide.
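
The similarity measure and gap penalty described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (with "-" marking gap positions) and that the unary comparison matrix is used, so that "similar" means identical. The function names and the example sequences are hypothetical and are not taken from any GAP implementation.

```python
def gap_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Aligned identical symbols divided by the length of the shorter sequence.

    Uses a unary comparison (1 for identity, 0 otherwise); gap columns never match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Sequence lengths are counted without gap characters.
    shorter = min(len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must not be empty")
    return matches / shorter


def gap_penalty(gap_length: int, per_gap: float = 3.0, per_symbol: float = 0.10) -> float:
    """Penalty of 3.0 for a gap plus 0.10 for each symbol in that gap."""
    return per_gap + per_symbol * gap_length


# Hypothetical example: 7 identical columns, shorter sequence has 8 symbols.
similarity = gap_similarity("ACGT-ACGT", "ACGTTACGA")
```

End gaps would carry no penalty under default parameter (3); the sketch leaves that decision to the caller.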

As used herein, recitation that amino acids of a polypeptide correspond to
amino acids in a disclosed sequence, such as amino acids Q660 and M661 of
SEQ ID No. 6, refers to amino acids identified upon alignment of the polypeptide
with the disclosed sequence to maximize identity or homology (where conserved
amino acids are aligned) using a standard alignment algorithm, such as the GAP
algorithm.

As used herein, the term "at least 90% identical to" refers to percent
identities from 90 to 100% relative to the reference polypeptides. Identity at a
level of 90% or more is indicative of the fact that, assuming for exemplification
purposes a test and reference polypeptide length of 100 amino acids are
compared, no more than 10% (i.e., 10 out of 100) of amino acids in the test
polypeptide differ from those of the reference polypeptides. Similar comparisons
can be made between test and reference polynucleotides. Such differences
can be represented as point mutations randomly distributed over the entire
length of an amino acid sequence or they can be clustered in one or more
locations of varying length up to the maximum allowable, e.g., 10/100 amino
acid difference (approximately 90% identity). Differences are defined as nucleic
acid or amino acid substitutions, insertions or deletions. At the level of
homologies or identities above about 85-90%, the result should be independent
of the program and gap parameters set; such high levels of identity can be
assessed readily, often without relying on software.
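
The 10-out-of-100 example above can be expressed as a short sketch. It assumes a pre-aligned test and reference sequence in which "-" marks an insertion or deletion, and it counts every differing column (substitution, insertion or deletion) against the reference length. This is one reading of the definition; the function names are hypothetical.

```python
def percent_identity(aligned_test: str, aligned_ref: str, gap: str = "-") -> float:
    """Percent identity of a test sequence relative to a reference sequence."""
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    ref_length = len(aligned_ref.replace(gap, ""))
    if ref_length == 0:
        raise ValueError("reference must not be empty")
    # Any column where the two rows differ is a substitution, insertion or deletion.
    differences = sum(1 for t, r in zip(aligned_test, aligned_ref) if t != r)
    identical = max(ref_length - differences, 0)
    return 100.0 * identical / ref_length


def is_at_least_identical(aligned_test: str, aligned_ref: str, threshold: float = 90.0) -> bool:
    """True when the test sequence meets the stated identity threshold."""
    return percent_identity(aligned_test, aligned_ref) >= threshold


# Hypothetical 100-residue reference with 10 clustered substitutions in the test.
reference = "A" * 100
test_sequence = "G" * 10 + "A" * 90
```

With the hypothetical sequences above, the 10 differences may be clustered or scattered; only their count affects the result.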

As used herein, primer refers to an oligonucleotide containing two or 
more deoxyribonucleotides or ribonucleotides, typically more than three, from
which synthesis of a primer extension product can be initiated. Experimental 
conditions conducive to synthesis include the presence of nucleoside 
triphosphates and an agent for polymerization and extension, such as DNA 
polymerase, and a suitable buffer, temperature and pH.

As used herein, animal includes any animal, such as, but not limited
to, primates including humans, gorillas and monkeys; rodents, such as mice and
rats; fowl, such as chickens; ruminants, such as goats, cows, deer, sheep;
porcine, such as pigs and other animals. Non-human animals exclude humans as
the contemplated animal. The SPs provided herein are from any source, animal,
plant, prokaryotic and fungal. Most CVSP16s are of animal origin, including
mammalian origin.

As used herein, genetic therapy or gene therapy involves the transfer of 
heterologous nucleic acid, such as DNA, into certain cells, target cells, of a 
mammal, particularly a human, with a disorder or conditions for which such 
therapy is sought. The nucleic acid, such as DNA, is introduced into the
selected target cells, such as directly or in a vector or other delivery vehicle, in a
manner such that the heterologous nucleic acid, such as DNA, is expressed and 
a therapeutic product encoded thereby is produced. Alternatively, the 
heterologous nucleic acid, such as DNA, can in some manner mediate expression
of DNA that encodes the therapeutic product, or it can encode a product, such
as a peptide or RNA that in some manner mediates, directly or indirectly, 
expression of a therapeutic product. Genetic therapy also can be used to deliver 
nucleic acid encoding a gene product that replaces a defective gene or 
supplements a gene product produced by the mammal or the cell in which it is 
introduced. The introduced nucleic acid can encode a therapeutic compound,
such as a growth factor or inhibitor thereof, or a tumor necrosis factor or inhibitor
thereof, such as a receptor therefor, that is not normally produced in the
mammalian host or that is not produced in therapeutically effective amounts or
at a therapeutically useful time. The heterologous nucleic acid, such as DNA, 
encoding the therapeutic product can be modified prior to introduction into the 
cells of the afflicted host in order to enhance or otherwise alter the product or 
expression thereof. Genetic therapy also can involve delivery of an inhibitor or
repressor or other modulator of gene expression. 

As used herein, heterologous nucleic acid is nucleic acid that is not 
normally produced in vivo by the cell in which it is expressed or that is produced 
by the cell but is at a different locus or expressed differently or that mediates or
encodes mediators that alter expression of endogenous nucleic acid, such as
DNA, by affecting transcription, translation, or other regulatable biochemical 
processes. Heterologous nucleic acid is generally not endogenous to the cell 
into which it is introduced, but has been obtained from another cell or prepared 
synthetically. Heterologous nucleic acid can be endogenous, but is nucleic acid
that is expressed from a different locus or altered in its expression. Generally,
although not necessarily, such nucleic acid encodes RNA and proteins that are 
not normally produced by the cell or in the same way in the cell in which it is 
expressed. Heterologous nucleic acid, such as DNA, also can be referred to as 
foreign nucleic acid, such as DNA. Thus, heterologous nucleic acid or foreign
nucleic acid includes a nucleic acid molecule not present in the exact orientation
or position as the counterpart nucleic acid molecule, such as DNA, is found in a 
genome. It also can refer to a nucleic acid molecule from another organism or 
species (i.e., exogenous). 

Any nucleic acid, such as DNA, that one of skill in the art would
recognize or consider as heterologous or foreign to the cell in which the nucleic
acid is expressed is herein encompassed by heterologous nucleic acid; 
heterologous nucleic acid includes exogenously added nucleic acid that also is 
expressed endogenously. Examples of heterologous nucleic acid include, but are 
not limited to, nucleic acid that encodes traceable marker proteins, such as a
protein that confers drug resistance, nucleic acid that encodes therapeutically
effective substances, such as anti-cancer agents, enzymes and hormones, and 
nucleic acid, such as DNA, that encodes other types of proteins, such as
antibodies. Antibodies that are encoded by heterologous nucleic acid can be
secreted or expressed on the surface of the cell in which the heterologous 
nucleic acid has been introduced. 

As used herein, a therapeutically effective product for gene therapy is a
product that is encoded by heterologous nucleic acid, typically DNA, and that, upon
introduction of the nucleic acid into a host, is expressed and
ameliorates or eliminates the symptoms or manifestations of an inherited or
acquired disease, or cures the disease. Also included are biologically active
nucleic acid molecules, such as RNAi and antisense.

As used herein, recitation that a polypeptide consists essentially of the
protease domain means that the only SP portion of the polypeptide is a protease
domain or a catalytically active portion thereof. The polypeptide can optionally,
and generally will, include additional non-SP-derived sequences of amino acids.

As used herein, cancer or tumor treatment or agent refers to any
therapeutic regimen and/or compound that, when used alone or in combination
with other treatments or compounds, can alleviate, reduce, ameliorate, prevent,
or place or maintain in a state of remission clinical symptoms or diagnostic
markers associated with deficient angiogenesis.

As used herein, domain refers to a portion of a molecule, e.g., proteins
or the encoding nucleic acids, that is structurally and/or functionally distinct from
other portions of the molecule.

As used herein, nucleic acids include DNA, RNA and analogs thereof, 
including peptide nucleic acids (PNA) and mixtures thereof. Nucleic acids can be 
single or double-stranded. When referring to probes or primers, which are
optionally labeled, such as with a detectable label, such as a fluorescent or
radiolabel, single-stranded molecules are contemplated. Such molecules are 
typically of a length such that their target is statistically unique or of low copy 
number (typically less than 5, generally less than 3) for probing or priming a 
library. Generally a probe or primer contains at least 14, 16 or 30 contiguous
nucleotides of sequence complementary to or identical to a gene of interest.
Probes and primers can be 10, 20, 30, 50, 100 or more nucleotides long.

As used herein, a probe or primer based on a nucleotide sequence
disclosed herein includes at least 10, 14, typically at least 16 contiguous
nucleotides of SEQ ID No. 5, and probes of at least 30, 50 or 100 contiguous
nucleotides of SEQ ID No. 5. The length of the probe or primer required for
unique hybridization is a function of the complexity of the genome of interest.
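
The dependence of probe length on genome complexity can be illustrated with a rough estimate. The sketch below assumes, purely for illustration, a genome of random sequence with equal base frequencies, in which a given n-mer is expected to occur about 2G/4^n times by chance when both strands of a genome of G base pairs are searched. Real genomes contain repeats and biased base composition, so this is only a lower bound; the function names and genome sizes are hypothetical inputs.

```python
def expected_random_matches(probe_length: int, genome_size_bp: int) -> float:
    """Expected chance occurrences of one n-mer on both strands of a random genome."""
    return 2 * genome_size_bp / 4 ** probe_length


def min_unique_probe_length(genome_size_bp: int) -> int:
    """Shortest probe length whose expected number of chance matches is below 1."""
    length = 1
    while expected_random_matches(length, genome_size_bp) >= 1.0:
        length += 1
    return length


# Approximate genome sizes used only as illustrative inputs.
human_like = min_unique_probe_length(3_000_000_000)
bacterial_like = min_unique_probe_length(4_600_000)
```

Under these assumptions, a larger genome calls for a longer probe, which is consistent with the typical minimum lengths recited above.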

As used herein, nucleic acid encoding a fragment or portion of a SP refers 
to a nucleic acid encoding only the recited fragment or portion of a SP, and not 
the other contiguous portions of the SP. 

As used herein, operative linkage of heterologous nucleic acid to regulatory
and effector sequences of nucleotides, such as promoters, enhancers,
transcriptional and translational stop sites, and other signal sequences refers to
the relationship between such nucleic acid, such as DNA, and such sequences of 
nucleotides. For example, operative linkage of heterologous DNA to a promoter 
refers to the physical relationship between the DNA and the promoter such that
the transcription of such DNA is initiated from the promoter by an RNA
polymerase that specifically recognizes, binds to and transcribes the DNA.
Thus, operatively linked or operationally associated refers to the functional 
relationship of nucleic acid, such as DNA, with regulatory and effector 
sequences of nucleotides, such as promoters, enhancers, transcriptional and
translational stop sites, and other signal sequences. For example, operative
linkage of DNA to a promoter refers to the physical and functional relationship 
between the DNA and the promoter such that the transcription of such DNA is 
initiated from the promoter by an RNA polymerase that specifically recognizes, 
binds to and transcribes the DNA. In order to optimize expression and/or in vitro
transcription, it can be necessary to remove, add or alter 5' untranslated portions
of the clones to eliminate extra, potentially inappropriate alternative translation 
initiation (i.e., start) codons or other sequences that can interfere with or reduce 
expression, either at the level of transcription or translation. Alternatively, 
consensus ribosome binding sites (see, e.g., Kozak J. Biol. Chem. 266:19867-19870
(1991)) can be inserted immediately 5' of the start codon and can
enhance expression. The desirability of (or need for) such modification can be
empirically determined.

As used herein, a sequence complementary to at least a portion of an
RNA, with reference to antisense oligonucleotides, means a sequence having 
sufficient complementarity to be able to hybridize with the RNA, generally under 
moderate or high stringency conditions, forming a stable duplex; in the case of 
double-stranded SP antisense nucleic acids, a single strand of the duplex DNA
(or dsRNA) can thus be tested, or triplex formation can be assayed. The ability
to hybridize depends on the degree of complementarity and the length of the
antisense nucleic acid. Generally, the longer the hybridizing nucleic acid, the 
more base mismatches with a SP encoding RNA it can contain and still form a
stable duplex (or triplex, as the case can be). One skilled in the art can ascertain
a tolerable degree of mismatch by use of standard procedures to determine the 
melting point of the hybridized complex. 

For purposes herein, amino acid substitutions, deletions and/or insertions, 
can be made in any of the CVSPs and protease domains thereof provided that the
resulting protein exhibits protease activity or other activity (or, if desired, such
changes can be made to eliminate activity). Muteins can be made by making 
conservative amino acid substitutions and also non-conservative amino acid 
substitutions. For example, amino acid substitutions that desirably or 
advantageously alter properties of the proteins can be made. In one
embodiment, mutations that prevent degradation of the polypeptide can be
made. Many proteases cleave after basic residues, such as R and K; to eliminate
such cleavage, the basic residue is replaced with a non-basic residue. 
Interaction of the protease with an inhibitor can be blocked while retaining 
catalytic activity by effecting a non-conservative change at the site of interaction
of the inhibitor with the protease. Other activities also can be altered. For
example, receptor binding can be altered without altering catalytic activity. 
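
The replacement of a basic residue to remove a potential cleavage site can be sketched as follows. The sketch locates R and K residues in a one-letter amino acid sequence and substitutes a chosen non-basic residue at one position. The example sequence, the replacement residue and the function names are hypothetical; whether a given mutein retains activity would still have to be determined empirically.

```python
BASIC_RESIDUES = frozenset("RK")


def basic_residue_positions(sequence: str) -> list[int]:
    """1-based positions of Arg (R) and Lys (K), after which many proteases cleave."""
    return [i for i, aa in enumerate(sequence, start=1) if aa in BASIC_RESIDUES]


def replace_residue(sequence: str, position: int, replacement: str) -> str:
    """Return a mutein with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    if replacement in BASIC_RESIDUES:
        raise ValueError("replacement must be a non-basic residue")
    return sequence[: position - 1] + replacement + sequence[position:]


# Hypothetical peptide with basic residues at positions 2 and 6.
peptide = "MKTAYRG"
sites = basic_residue_positions(peptide)
mutein = replace_residue(peptide, 6, "Q")
```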

Amino acid substitutions contemplated include conservative substitutions, 
such as those set forth in Table 1, which do not eliminate proteolytic activity.
As described herein, substitutions that alter properties of the proteins, such as
removal of cleavage sites and other such sites are also contemplated; such
substitutions are generally non-conservative, but can be readily effected by 
those of skill in the art.

Suitable conservative substitutions of amino acids are known to those of
skill in this art and can be made generally without altering the biological activity,
for example enzymatic activity, of the resulting molecule. Those of skill in this
art recognize that, in general, single amino acid substitutions in non-essential
regions of a polypeptide do not substantially alter biological activity (see, e.g.,
Watson et al. Molecular Biology of the Gene, 4th Edition, 1987, The
Benjamin/Cummings Pub. Co., p. 224). Also included within the definition is the
catalytically active fragment of a SP, particularly a single chain protease portion.
Conservative amino acid substitutions are made, for example, in accordance
with those set forth in TABLE 1 as follows:

TABLE 1

Original residue     Conservative substitution
Ala (A)              Gly; Ser; Abu
Arg (R)              Lys; Orn
Asn (N)              Gln; His
Cys (C)              Ser
Gln (Q)              Asn
Glu (E)              Asp
Gly (G)              Ala; Pro
His (H)              Asn; Gln
Ile (I)              Leu; Val; Met; Nle; Nva
Leu (L)              Ile; Val; Met; Nle; Nva
Lys (K)              Arg; Gln; Glu
Met (M)              Leu; Tyr; Ile; Nle; Val
Ornithine            Lys; Arg
Phe (F)              Met; Leu; Tyr
Ser (S)              Thr
Thr (T)              Ser
Trp (W)              Tyr
Tyr (Y)              Trp; Phe
Val (V)              Ile; Leu; Met; Nle; Nva

Other substitutions are also permissible and can be determined empirically or in 
accord with known conservative substitutions. 
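
Table 1 lends itself to a simple lookup. The sketch below encodes the table as a mapping from each original residue (three-letter code, with Orn for ornithine) to its listed conservative substitutions and checks whether a proposed replacement appears in the table. The names `TABLE_1` and `is_conservative` are hypothetical. The mapping is directional, as in the table, and a substitution not listed here may still be permissible as noted above.

```python
# Table 1, keyed by original residue; values are the listed substitutions.
TABLE_1 = {
    "Ala": {"Gly", "Ser", "Abu"},
    "Arg": {"Lys", "Orn"},
    "Asn": {"Gln", "His"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Ala", "Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val", "Met", "Nle", "Nva"},
    "Leu": {"Ile", "Val", "Met", "Nle", "Nva"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Tyr", "Ile", "Nle", "Val"},
    "Orn": {"Lys", "Arg"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu", "Met", "Nle", "Nva"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True when Table 1 lists the replacement for the original residue."""
    return replacement in TABLE_1.get(original, set())
```

For example, `is_conservative("Ala", "Ser")` is true, whereas `is_conservative("Ser", "Ala")` is false because the table lists only Thr for Ser.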

As used herein, Abu is 2-aminobutyric acid; Orn is ornithine.

As used herein, the amino acids, which occur in the various amino acid
sequences appearing herein, are identified according to their well-known,
three-letter or one-letter abbreviations. The nucleotides, which occur in the various
DNA fragments, are designated with the standard single-letter designations used 
routinely in the art.

As used herein, amelioration of the symptoms of a particular disorder by
administration of a particular pharmaceutical composition refers to any lessening, 
whether permanent or temporary, lasting or transient that can be attributed to or 
associated with administration of the composition.

As used herein, antisense polynucleotides refer to synthetic sequences of
nucleotide bases complementary to mRNA or the sense strand of
double-stranded DNA. Admixture of sense and antisense polynucleotides under
appropriate conditions leads to the binding of the two molecules, or 
hybridization. When these polynucleotides bind to (hybridize with) mRNA,
inhibition of protein synthesis (translation) occurs. When these polynucleotides
bind to double-stranded DNA, inhibition of RNA synthesis (transcription) occurs. 
The resulting inhibition of translation and/or transcription leads to an inhibition of 
the synthesis of the protein encoded by the sense strand. Antisense nucleic 
acid molecules typically contain a sufficient number of nucleotides to specifically
bind to a target nucleic acid, generally at least 5 contiguous nucleotides, often at
least 14 or 16 or 30 contiguous nucleotides or modified nucleotides
complementary to the coding portion of a nucleic acid molecule that encodes a 
gene of interest, for example, nucleic acid encoding a single chain protease 
domain of a SP. 
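
As an illustration of complementarity to mRNA, the sketch below derives the antisense DNA sequence for a short mRNA segment by taking its reverse complement, so that both sequences read 5' to 3'. The function name and the example sequence are hypothetical, and the sketch handles only the unmodified bases A, C, G and U.

```python
# Watson-Crick partners of each mRNA base, written as DNA bases.
RNA_TO_ANTISENSE_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(mrna: str) -> str:
    """Antisense DNA (5'->3') complementary to an mRNA segment given 5'->3'."""
    try:
        return "".join(RNA_TO_ANTISENSE_DNA[base] for base in reversed(mrna.upper()))
    except KeyError as err:
        raise ValueError(f"unexpected base in mRNA: {err.args[0]}") from None


# Hypothetical mRNA segment.
oligo = antisense_dna("AUGGCC")
```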

As used herein, an array refers to a collection of elements, such as
antibodies, containing two, three or more members. An addressable array is one
in which the members of the array are identifiable, typically by position on a 
solid phase support. Hence, in general the members of the array are immobilized 
on discrete identifiable loci on the surface of a solid phase. 

As used herein, antibody refers to an immunoglobulin, whether natural or
partially or wholly synthetically produced, including any derivative thereof that
retains the specific binding ability of the antibody. Hence antibody includes any 
protein having a binding domain that is homologous or substantially homologous 
to an immunoglobulin binding domain. Antibodies include members of any
immunoglobulin class, including IgG, IgM, IgA, IgD and IgE.

As used herein, antibody fragment refers to any derivative of an antibody
that is less than full length, retaining at least a portion of the full-length
antibody's specific binding ability. Examples of antibody fragments include, but
are not limited to, Fab, Fab', F(ab)2, single-chain Fvs (scFv), Fv, dsFv, diabody
and Fd fragments. The fragment can include multiple chains linked together,
such as by disulfide bridges. An antibody fragment generally contains at least
about 50 amino acids and typically at least 200 amino acids.

As used herein, a Fv antibody fragment is composed of one variable
heavy chain domain (VH) and one variable light chain domain (VL) linked by
noncovalent interactions.

As used herein, a dsFv refers to an Fv with an engineered intermolecular
disulfide bond, which stabilizes the VH-VL pair.

As used herein, a F(ab)2 fragment is an antibody fragment that results
from digestion of an immunoglobulin with pepsin at pH 4.0-4.5; it can be 
recombinantly produced to produce the equivalent fragment. 

As used herein, Fab fragments are antibody fragments that result from
digestion of an immunoglobulin with papain; they can be recombinantly produced
to produce the equivalent fragment.

As used herein, scFvs refer to antibody fragments that contain a variable
light chain (VL) and variable heavy chain (VH) covalently connected by a
polypeptide linker in any order. The linker is of a length such that the two
variable domains are bridged without substantial interference. Included linkers
are (Gly-Ser)n residues with some Glu or Lys residues dispersed throughout to
increase solubility.

As used herein, humanized antibodies refer to antibodies that are 
modified to include human sequences of amino acids so that administration to a 
human does not provoke an immune response. Methods for preparation of such
antibodies are known. For example, to produce such antibodies, the encoding
nucleic acid in the hybridoma or other prokaryotic or eukaryotic cell, such as an
E. coli or a CHO cell, that expresses the monoclonal antibody is altered by
recombinant nucleic acid techniques to express an antibody in which the amino
acid composition of the non-variable region is based on human antibodies.
Computer programs have been designed to identify such non-variable regions.

As used herein, diabodies are dimeric scFv; diabodies typically have
shorter peptide linkers than scFvs, and they generally dimerize.

As used herein, production by recombinant means by using recombinant 
DNA methods means the use of the well known methods of molecular biology 
for expressing proteins encoded by cloned DNA.

As used herein the term assessing is intended to include 
quantitative and qualitative determination in the sense of obtaining an 
absolute value for the activity of a SP, or a domain thereof, present in the 
sample, and also of obtaining an index, ratio, percentage, visual or other value 
indicative of the level of the activity. Assessment can be direct or indirect and
the chemical species actually detected need not of course be the proteolysis 
product itself but can for example be a derivative thereof or some further 
substance. 

As used herein, biological activity refers to the in vivo activities of a 
compound or physiological responses that result upon in vivo administration of a
compound, composition or other mixture. Biological activity, thus, encompasses 
therapeutic effects and pharmaceutical activity of such compounds, 
compositions and mixtures. Biological activities can be observed in in vitro 
systems designed to test or use such activities. Thus, for purposes herein a 
biological activity of a protease is its catalytic activity in which a polypeptide is
hydrolyzed.

As used herein, functional activity refers to an activity or activities of a 
polypeptide or portion thereof associated with a full-length (complete) protein. 
Functional activities include, but are not limited to, biological activity, catalytic or 
enzymatic activity, antigenicity (ability to bind to or compete with a polypeptide
for binding to an anti-polypeptide antibody), immunogenicity, ability to form
multimers, and the ability to specifically bind to a receptor or ligand for the 
polypeptide. 

As used herein, a conjugate refers to the compounds provided herein that 
include one or more SPs, including a CVSP16, particularly single chain protease
domains thereof, and one or more targeting agents. These conjugates include 
those produced by recombinant means as fusion proteins, those produced by
chemical means, such as by chemical coupling, through, for example, coupling
to sulfhydryl groups, and those produced by any other method whereby at least 
one SP, or a domain thereof, is linked, directly or indirectly via linker(s) to a 
targeting agent. 

As used herein, a targeting agent is any moiety, such as a protein or
effective portion thereof, that provides specific binding of the conjugate to a cell
surface receptor, which in some instances can internalize bound conjugates or 
portions thereof. A targeting agent also can be one that promotes or facilitates, 
for example, affinity isolation or purification of the conjugate; attachment of the 
conjugate to a surface; or detection of the conjugate or complexes containing
the conjugate. 

As used herein, an antibody conjugate refers to a conjugate in which the 
targeting agent is an antibody. 

As used herein, derivative or analog of a molecule refers to a portion
derived from or a modified version of the molecule.

As used herein, an effective amount of a compound for treating a 
particular disease is an amount that is sufficient to ameliorate, or in some 
manner reduce the symptoms associated with the disease. Such an amount can 
be administered as a single dosage or can be administered according to a
regimen, whereby it is effective. The amount can cure the disease but, typically,
is administered in order to ameliorate the symptoms of the disease. Repeated 
administration can be required to achieve the desired amelioration of symptoms. 

As used herein equivalent, when referring to two sequences of nucleic 
acids, means that the two sequences in question encode the same sequence of
amino acids or equivalent proteins. When equivalent is used in referring to two
proteins or peptides, it means that the two proteins or peptides have 
substantially the same amino acid sequence with only amino acid substitutions 
(such as, but not limited to, conservative changes such as those set forth in 
Table 1, above) that do not substantially alter the activity or function of the
protein or peptide. When equivalent refers to a property, the property does not
need to be present to the same extent (e.g., two peptides can exhibit different 
rates of the same type of enzymatic activity), but the activities are usually
substantially the same. Complementary, when referring to two nucleotide
sequences, means that the two sequences of nucleotides are capable of 
hybridizing, typically with less than 25%, 15% or 5% mismatches between 
opposed nucleotides. If necessary, the percentage of complementarity will be 
specified. Typically the two molecules are selected such that they will hybridize
under conditions of high stringency. 
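
The mismatch percentages recited above can be computed for two strands of equal length as sketched below. Both strands are assumed to be written 5' to 3', so the second strand is reversed before opposed nucleotides are compared, and only A-T and G-C pairs count as matches. The function name and the example sequences are hypothetical; unequal lengths, RNA bases and non-canonical pairing are outside the scope of the sketch.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_mismatch(strand_a: str, strand_b: str) -> float:
    """Percentage of opposed nucleotides that are not Watson-Crick pairs.

    Both strands are given 5'->3'; strand_b is read in reverse so that the
    two strands are opposed in antiparallel orientation.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    opposed = zip(strand_a.upper(), reversed(strand_b.upper()))
    mismatches = sum(1 for pair in opposed if pair not in WATSON_CRICK)
    return 100.0 * mismatches / len(strand_a)


# Hypothetical 10-mer, its exact reverse complement, and a one-mismatch variant.
target = "ACGTACGTAC"
perfect = percent_mismatch(target, "GTACGTACGT")
one_off = percent_mismatch(target, "GTACGTACGA")
```

A result below 25%, 15% or 5% corresponds to the typical levels of mismatch named in the definition.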

As used herein, an agent that modulates the activity of a protein or 
expression of a gene or nucleic acid either decreases or increases or otherwise 
alters the activity of the protein or, in some manner, up- or down-regulates or
otherwise alters expression of the nucleic acid in a cell.

As used herein, inhibitor of the activity of a SP encompasses any 
substances that prohibit or decrease production, post-translational
modification(s), maturation, or membrane localization of the SP or any substance
that interferes with or decreases the proteolytic efficacy thereof, particularly of a
single chain form in an in vitro screening assay.

As used herein, a method for treating or preventing neoplastic disease 
means that any of the symptoms, such as the tumor, metastasis thereof, the 
vascularization of the tumors or other parameters by which the disease is 
characterized are reduced, ameliorated, prevented, placed in a state of remission, 

or maintained in a state of remission. It also means that the hallmarks of
neoplastic disease and metastasis can be eliminated, reduced or prevented by
the treatment. Non-limiting examples of the hallmarks include uncontrolled 
degradation of the basement membrane and proximal extracellular matrix, 
migration, division, and organization of the endothelial cells into new functioning
capillaries, and the persistence of such functioning capillaries.

As used herein, pharmaceutically acceptable salts, esters or other
derivatives of the conjugates include any salts, esters or derivatives that can be
readily prepared by those of skill in this art using known methods for such
derivatization and that produce compounds that can be administered to animals
or humans without substantial toxic effects and that either are pharmaceutically
active or are prodrugs.

As used herein, a prodrug is a compound that, upon in vivo
administration, is metabolized or otherwise converted to the biologically,
pharmaceutically or therapeutically active form of the compound. To produce a
prodrug, the pharmaceutically active compound is modified such that the active
compound is regenerated by metabolic processes. The prodrug can be designed
to alter the metabolic stability or the transport characteristics of a drug, to mask
side effects or toxicity, to improve the flavor of a drug or to alter other
characteristics or properties of a drug. By virtue of knowledge of
pharmacodynamic processes and drug metabolism in vivo, those of skill in this
art, once a pharmaceutically active compound is known, can design prodrugs of
the compound (see, e.g., Nogrady (1985) Medicinal Chemistry A Biochemical
Approach, Oxford University Press, New York, pages 388-392).

As used herein, a drug identified by the screening methods provided
herein refers to any compound that is a candidate for use as a therapeutic or as
a lead compound for the design of a therapeutic. Such compounds can be small
molecules, including small organic molecules, peptides, peptide mimetics,
antisense molecules or dsRNA, such as RNAi, antibodies, fragments of
antibodies, recombinant antibodies and other such compounds that can serve
as a drug candidate or lead compound.

As used herein, a peptidomimetic is a compound that mimics the
conformation and certain stereochemical features of the biologically active form
of a particular peptide. In general, peptidomimetics are designed to mimic
certain desirable properties of a compound, but not the undesirable properties,
such as flexibility, that lead to a loss of a biologically active conformation and
bond breakdown. Peptidomimetics can be prepared from biologically active
compounds by replacing certain groups or bonds that contribute to the
undesirable properties with bioisosteres. Bioisosteres are known to those of
skill in the art. For example, the methylene bioisostere CH2S has been used as an
amide replacement in enkephalin analogs (see, e.g., Spatola (1983) pp. 267-357
in Chemistry and Biochemistry of Amino Acids, Peptides, and Proteins,
Weinstein, Ed., volume 7, Marcel Dekker, New York). Morphine, which can be
administered orally, is a compound that is a peptidomimetic of the peptide
endorphin. For purposes herein, cyclic peptides are included among
peptidomimetics.

As used herein, a promoter region or promoter element refers to a 
segment of DNA or RNA that controls transcription of the DNA or RNA to which
it is operatively linked. The promoter region includes specific sequences that are
sufficient for RNA polymerase recognition, binding and transcription initiation.
This portion of the promoter region is referred to as the promoter. In addition,
the promoter region includes sequences that modulate this recognition, binding
and transcription initiation activity of RNA polymerase. These sequences can be
cis acting or can be responsive to trans acting factors. Promoters, depending
upon the nature of the regulation, can be constitutive or regulated. Exemplary 
promoters contemplated for use in prokaryotes include the bacteriophage T7 and 
T3 promoters. 

As used herein, a receptor refers to a molecule that has an affinity for a
given ligand. Receptors can be naturally-occurring or synthetic molecules.
Receptors also can be referred to in the art as anti-ligands. As used herein, the
terms receptor and anti-ligand are interchangeable. Receptors can be used in their
unaltered state or bound to other polypeptides, including as homodimers.
Receptors can be attached to, covalently or noncovalently, or in physical contact
with, a binding member, either directly or indirectly via a specific binding
substance or linker. Examples of receptors include, but are not limited to:
antibodies, cell membrane receptors, surface receptors and internalizing
receptors, monoclonal antibodies and antisera reactive with specific antigenic
determinants (such as on viruses, cells, or other materials), drugs,
polynucleotides, nucleic acids, peptides, cofactors, lectins, sugars,
polysaccharides, cells, cellular membranes, and organelles.

Examples of receptors and applications using such receptors include, but
are not restricted to:

a) enzymes: specific transport proteins or enzymes essential to survival
of microorganisms, which could serve as targets for antibiotic [ligand] selection;

b) antibodies: identification of a ligand-binding site on the antibody
molecule that combines with the epitope of an antigen of interest can be
investigated; determination of a sequence that mimics an antigenic epitope can
lead to the development of vaccines of which the immunogen is based on one or
more of such sequences or lead to the development of related diagnostic agents
or compounds useful in therapeutic treatments such as for auto-immune diseases;

c) nucleic acids: identification of ligand, such as protein or RNA, binding
sites;

d) catalytic polypeptides: polymers, including polypeptides, that are
capable of promoting a chemical reaction involving the conversion of one or
more reactants to one or more products; such polypeptides generally include a
binding site specific for at least one reactant or reaction intermediate and an
active functionality proximate to the binding site, in which the functionality is
capable of chemically modifying the bound reactant (see, e.g., U.S. Patent No.
5,215,899);

e) hormone receptors: determination of the ligands that bind with high
affinity to a receptor is useful in the development of hormone replacement
therapies; for example, identification of ligands that bind to such receptors can
lead to the development of drugs to control blood pressure; and

f) opiate receptors: determination of ligands that bind to the opiate
receptors in the brain is useful in the development of less-addictive replacements
for morphine and related drugs.

As used herein, sample refers to anything which can contain an analyte
for which an analyte assay is desired. The sample can be a biological sample,
such as a biological fluid or a biological tissue. Examples of biological fluids
include urine, blood, plasma, serum, saliva, semen, stool, sputum, cerebral spinal
fluid, tears, mucus, amniotic fluid or the like. Biological tissues are aggregates
of cells, usually of a particular kind together with their intercellular substance
that form one of the structural materials of a human, animal, plant, bacterial,
fungal or viral structure, including connective, epithelium, muscle and nerve
tissues. Examples of biological tissues also include organs, tumors, lymph
nodes, arteries and individual cell(s).

As used herein, stringency of hybridization in determining percentage
mismatch is as follows:

1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C

2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C

3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C
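
The three wash conditions listed above can be held in a small lookup table. The following is a minimal sketch, not part of the disclosed methods; the names `WASH_CONDITIONS` and `wash_nacl_molar` are hypothetical, and the conversion assumes only what this section states, namely that 1 x SSPE is phosphate-buffered 0.18 M NaCl.

```python
# Assumption: 1 x SSPE corresponds to 0.18 M NaCl, as stated in this section.
SSPE_1X_NACL_MOLAR = 0.18

# Wash conditions for each stringency level, copied from the list above.
WASH_CONDITIONS = {
    "high": {"sspe_fold": 0.1, "sds_percent": 0.1, "temp_c": 65},
    "medium": {"sspe_fold": 0.2, "sds_percent": 0.1, "temp_c": 50},
    "low": {"sspe_fold": 1.0, "sds_percent": 0.1, "temp_c": 50},
}


def wash_nacl_molar(level: str) -> float:
    """Return the approximate NaCl molarity of the wash buffer for a stringency level."""
    return WASH_CONDITIONS[level]["sspe_fold"] * SSPE_1X_NACL_MOLAR
```

Under these assumptions, the high stringency wash corresponds to roughly 0.018 M NaCl at 65°C, which illustrates why lower salt and higher temperature together select for more stable hybrids.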

Those of skill in this art know that the washing step selects for stable
hybrids and also know the ingredients of SSPE (see, e.g., Sambrook, E.F.
Fritsch, T. Maniatis, in: Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Laboratory Press (1989), vol. 3, p. B.13, see, also, numerous catalogs
that describe commonly used laboratory solutions). SSPE is pH 7.4 phosphate-
buffered 0.18 M NaCl. Further, those of skill in the art recognize that the
stability of hybrids is determined by Tm, which is a function of the sodium ion
concentration and temperature (Tm = 81.5°C - 16.6(log10[Na+]) + 0.41(%G+C) -
600/D), so that the only parameters in the wash conditions critical to hybrid
stability are sodium ion concentration in the SSPE (or SSC) and temperature.
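
The dependence of Tm on sodium ion concentration can be illustrated numerically. The sketch below is an illustration only and rests on two assumptions that are not taken from this text: D is read as the hybrid length in base pairs, and the log10[Na+] term is added, following the commonly cited form of this empirical relation in which lowering the salt concentration lowers Tm. The function name `melting_temperature` is hypothetical.

```python
import math


def melting_temperature(na_molar: float, gc_percent: float, duplex_length: int) -> float:
    """Estimate hybrid Tm in degrees C from [Na+] (mol/l), %G+C and hybrid length.

    Assumption: uses the commonly cited sign convention, with the
    log10[Na+] term added, and treats D as hybrid length in base pairs.
    """
    if na_molar <= 0 or duplex_length <= 0:
        raise ValueError("na_molar and duplex_length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 600.0 / duplex_length
    )
```

With this convention, a tenfold reduction in sodium ion concentration lowers the estimated Tm by 16.6°C, which is consistent with low-salt washes being the more stringent ones.
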
It is understood that equivalent stringencies can be achieved using
alternative buffers, salts and temperatures. By way of example and not
limitation, procedures using conditions of low stringency are as follows (see also
Shilo and Weinberg, Proc. Natl. Acad. Sci. USA 78:6789-6792 (1981)): Filters
containing DNA are pretreated for 6 hours at 40°C in a solution containing 35%
formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1%
Ficoll, 1% BSA, and 500 μg/ml denatured salmon sperm DNA (10X SSC is 1.5
M sodium chloride, and 0.15 M sodium citrate, adjusted to a pH of 7).

Hybridizations are carried out in the same solution with the following
modifications: 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 μg/ml salmon sperm
DNA, 10% (wt/vol) dextran sulfate, and 5-20 X 10^6 cpm 32P-labeled probe is
used. Filters are incubated in hybridization mixture for 18-20 hours at 40°C,
and then washed for 1.5 hours at 55°C in a solution containing 2X SSC, 25 mM
Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS. The wash solution is replaced
with fresh solution and incubated an additional 1.5 hours at 60°C. Filters are
blotted dry and exposed for autoradiography. If necessary, filters are washed for
a third time at 65-68°C and reexposed to film. Other conditions of low
stringency which can be used are well known in the art (e.g., as employed for
cross-species hybridizations).

By way of example and not by way of limitation, procedures using
conditions of moderate stringency are as follows: Filters
containing DNA are pretreated for 6 hours at 55°C in a solution containing 6X
SSC, 5X Denhardt's solution, 0.5% SDS and 100 μg/ml denatured salmon sperm
DNA. Hybridizations are carried out in the same solution and 5-20 X 10^6 cpm
32P-labeled probe is used. Filters are incubated in hybridization mixture for 18-20
hours at 55°C, and then washed twice for 30 minutes at 60°C in a solution
containing 1X SSC and 0.1% SDS. Filters are blotted dry and exposed for
autoradiography. Other conditions of moderate stringency which can be used
are well known in the art. Washing of filters is done at 37°C for 1 hour in a
solution containing 2X SSC, 0.1% SDS.

By way of example and not by way of limitation, procedures using conditions
of high stringency are as follows: Prehybridization of filters containing DNA is
carried out for 8 hours to overnight at 65°C in buffer composed of 6X SSC,
50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA,
and 500 μg/ml denatured salmon sperm DNA. Filters are hybridized for 48 hours
at 65°C in prehybridization mixture containing 100 μg/ml denatured salmon
sperm DNA and 5-20 X 10^6 cpm of 32P-labeled probe. Washing of filters is done
at 37°C for 1 hour in a solution containing 2X SSC, 0.01% PVP, 0.01% Ficoll,
and 0.01% BSA. This is followed by a wash in 0.1X SSC at 50°C for 45
minutes before autoradiography. Other conditions of high stringency which can
be used are well known in the art.

The term substantially identical or homologous or similar varies with the
context as understood by those skilled in the relevant art and generally means at
least 60% or 70%, preferably means at least 80%, more preferably at least
90%, and most preferably at least 95% identity.
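
The identity thresholds above presuppose some way of computing percent identity between two sequences. The following is a minimal sketch, assuming the two sequences have already been aligned by a standard alignment program and that gaps are written as "-". The function name `percent_identity` and the choice of denominator (all aligned columns that are not gaps in both sequences) are assumptions for illustration; other conventions exist and can give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns with identical residues in two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    # A column counts as a match only if the residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)
```

For example, two aligned ten-residue sequences that differ at one position give 90% identity, which would meet the "more preferably at least 90%" threshold but not the 95% threshold.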

As used herein, substantially identical to a product means sufficiently
similar so that the property of interest is sufficiently unchanged so that the
substantially identical product can be used in place of the product.

As used herein, substantially pure means sufficiently homogeneous to
appear free of readily detectable impurities as determined by standard methods
of analysis, such as thin layer chromatography (TLC), gel electrophoresis and
high performance liquid chromatography (HPLC), used by those of skill in the art
to assess such purity, or sufficiently pure such that further purification would
not detectably alter the physical and chemical properties, such as enzymatic and
biological activities, of the substance. Methods for purification of the
compounds to produce substantially chemically pure compounds are known to
those of skill in the art. A substantially chemically pure compound can,
however, be a mixture of stereoisomers or isomers. In such instances, further
purification might increase the specific activity of the compound.

As used herein, target cell refers to a cell that expresses a SP in vivo.

As used herein, test substance (or test compound) refers to a chemically
defined compound (e.g., organic molecules, inorganic molecules,
organic/inorganic molecules, proteins, peptides, nucleic acids, oligonucleotides,
lipids, polysaccharides, saccharides, or hybrids among these molecules such as
glycoproteins, and others) or mixtures of compounds (e.g., a library of test
compounds, natural extracts or culture supernatants and others) whose effect on
a SP, particularly a single chain form that includes the protease domain or a
sufficient portion thereof for activity, as determined by an in vitro method, such
as the assays provided herein, is tested.

As used herein, a molecule, such as an antibody, that specifically binds to
a polypeptide typically has a binding affinity (Ka) of at least about 10^6 l/mol, 10^7
l/mol, 10^8 l/mol, 10^9 l/mol, 10^10 l/mol or greater and binds to a protein of interest
generally with at least 2-fold, 5-fold, generally 10-fold or even 100-fold or
greater affinity than to other proteins. For example, an antibody that
specifically binds to the protease domain compared to the full-length molecule,
such as the zymogen form, binds with at least about 2-fold, typically 5-fold or
10-fold higher affinity, to a polypeptide that contains only the protease domain
than to the zymogen form of the full-length. Such specific binding also is
referred to as selective binding. Thus, specific or selective binding refers to
greater binding affinity (generally at least 2-fold, 5-fold, 10-fold or more) to a
targeted site or locus compared to a non-targeted site or locus.

As used herein, the terms a therapeutic agent, therapeutic regimen,
radioprotectant, or chemotherapeutic mean conventional drugs and drug
therapies, including vaccines, which are known to those skilled in the art.
Radiotherapeutic agents are well known in the art.

As used herein, treatment means any manner in which the symptoms of a
condition, disorder or disease are ameliorated or otherwise beneficially altered.
Treatment also encompasses any pharmaceutical use of the compositions herein.

As used herein, vector (or plasmid) refers to discrete elements that are
used to introduce heterologous nucleic acid into cells for either expression or
replication thereof. The vectors typically remain episomal, but can be designed
to effect integration of a gene or portion thereof into a chromosome of the
genome. Also contemplated are vectors that are artificial chromosomes, such as
yeast artificial chromosomes and mammalian artificial chromosomes. Selection
and use of such vehicles are well known to those of skill in the art. An
expression vector includes vectors capable of expressing DNA that is operatively
linked with regulatory sequences, such as promoter regions, that are capable of
effecting expression of such DNA fragments. Thus, an expression vector refers
to a recombinant DNA or RNA construct, such as a plasmid, a phage,
recombinant virus or other vector that, upon introduction into an appropriate
host cell, results in expression of the cloned DNA. Appropriate expression
vectors are well known to those of skill in the art and include those that are
replicable in eukaryotic cells and/or prokaryotic cells and those that remain
episomal or those which integrate into the host cell genome.

As used herein, protein binding sequence refers to a protein or peptide
sequence that is capable of specific binding to other protein or peptide
sequences generally, to a set of protein or peptide sequences or to a particular
protein or peptide sequence.

As used herein, epitope tag refers to a short stretch of amino acid
residues corresponding to an epitope to facilitate subsequent biochemical and
immunological analysis of the epitope tagged protein or peptide. Epitope tagging
is achieved by adding the sequence of the epitope tag to a protein-encoding
sequence in an appropriate expression vector. Epitope tagged proteins can be
affinity purified using highly specific antibodies raised against the tags.

As used herein, metal binding sequence refers to a protein or peptide
sequence that is capable of specific binding to metal ions generally, to a set of
metal ions or to a particular metal ion.

As used herein, a combination refers to any association between two or 
among more items. 

As used herein, a composition refers to any mixture. It can be a solution,
a suspension, liquid, powder, a paste, aqueous, non-aqueous or any combination
thereof.

As used herein, fluid refers to any composition that can flow. Fluids thus 
encompass compositions that are in the form of semi-solids, pastes, solutions, 
aqueous mixtures, gels, lotions, creams and other such compositions. 

As used herein, a kit is a packaged combination optionally including
instructions for use of the combination and/or other reactions and components
for such use.

As used herein, a cellular extract refers to a preparation or fraction which 
is made from a lysed or disrupted cell. 

As used herein, an agent is said to be randomly selected when the agent
is chosen randomly without considering the specific sequences involved in the
association of a protein alone or with its associated substrates, binding partners
and/or other components. An example of randomly selected agents is the use of a
chemical library or a peptide combinatorial library, or a growth broth of an
organism or conditioned medium.

As used herein, an agent is said to be rationally selected or designed
when the agent is chosen on a non-random basis which takes into account the
sequence of the target site and/or its conformation in connection with the
agent's action. As described in the Examples, there are proposed binding sites
for serine protease and (catalytic) sites in the protein having SEQ ID No. 6.
Agents can be rationally selected or rationally designed by using the peptide
sequences that make up these sites.

For clarity of disclosure, and not by way of limitation, the detailed
description is divided into the subsections that follow.

B. CVSP16 POLYPEPTIDES, MUTEINS, DERIVATIVES AND ANALOGS
THEREOF

SPs

The serine proteases (SPs) are a family of proteins found in mammals and
also other species. SPs share a number of common structural features as
described herein. The proteolytic domains share sequence homology generally
including conserved His, Asp, and Ser residues necessary for efficient catalytic
activity that are present in conserved motifs. These SPs generally are
synthesized as zymogens, and activated to multi-chain forms (i.e., two-chains,
three-chains) by specific cleavage.

The SP family can be targeted for therapeutic intervention and also can
serve as diagnostic markers for tumor initiation, development, growth and/or
progression. As discussed, members of this family are involved in proteolytic
processes that are implicated in tumor development, growth and/or progression.
This implication is based upon their functions as proteolytic enzymes in
extracellular matrix degradation and remodelling and growth- and pro-angiogenic
factor activation. In addition, their levels of expression or level of activation or
their apparent activity resulting from substrate and/or co-factor levels or
alterations in substrates and/or co-factors and levels thereof differ in tumor cells
from non-tumor cells in the same tissue. Hence, protocols and treatments that
alter their activity, such as their proteolytic activities and roles in signal
transduction, and/or their expression, such as by contacting them with a
compound that modulates their activity and/or expression, could impact tumor
development, growth and/or progression. Also, in some instances, the level of
activation and/or expression can be altered in tumors, such as pancreas,
stomach, uterus, lung, colon and cervical cancers, and also breast, prostate or
leukemias. The SP, thus, can serve as a diagnostic marker for tumors.

In other instances the SP protein can exhibit altered activity by virtue of a
change in activity or expression of a co-factor therefor or a substrate therefor.
Detection of the SPs, particularly the protease domains, in body fluids, such as
serum, blood, saliva, cerebral spinal fluid, synovial fluid and interstitial fluids,
urine, sweat and other such fluids and secretions, can serve as a diagnostic
tumor marker. In particular, detection of higher levels of such polypeptides in a
subject compared to a subject known not to have any neoplastic disease or
compared to earlier samples from the same subject, can be indicative of
neoplastic disease in the subject.

CVSP16

Provided are polypeptide family members designated CVSP16. The
CVSP16s provided herein are serine proteases that are expressed and/or
activated in certain tumors; hence their activation or expression can serve as a
diagnostic marker for tumor development, growth and/or progression. The
CVSP16 polypeptides provided herein can be used as a drug or therapeutic
target and used in screening assays, including those exemplified herein.
Dimerized and higher multimers of CVSP16 polypeptides and/or portions thereof
are provided.

A single chain proteolytic domain can function in vitro and, hence, is
useful in in vitro assays for identifying agents that modulate the activity of
members of this family. In addition, two-chain or three-chain forms, the
activated full-length or truncated forms thereof, such as forms in which the
signal peptide is removed, also can be used in such assays. Assays for
activation also are provided.

The CVSP16 polypeptides provided herein are expressed or activated or
active in tumor cells, typically at a level that differs from the level in which they
are expressed or activated or active in the non-tumor cell of the same type.
Hence, for example, if the CVSP16 is expressed in a prostate or ovarian tumor
cell, to be of interest herein with respect to ovarian or prostate cancer, it is
expressed or activated or active at a different level in non-tumor prostate or
ovarian cells. CVSP16 polypeptide is expressed in lung, colon, prostate, breast,
uterine, ovarian and other tumor cells.

As discussed, the members of this family are involved in proteolytic
processes that are implicated in tumor development, growth and/or progression.
This implication is based upon their functions as proteolytic enzymes in
processes related to ECM degradative pathways. In addition, their levels of
expression or level of activation or their apparent activity resulting from
substrate levels or alterations in substrates and levels thereof differ in tumor
cells and non-tumor cells in the same tissue. Also, in some instances, the level
of activation and/or expression can be altered in tumors, such as pancreas,
stomach, uterus, lung, colon and cervical cancers, and also breast, prostate or
leukemias. The SP, thus, can serve as a diagnostic marker for tumors.

It is shown herein that CVSP16s provided herein are expressed and/or
activated in certain tumors; hence their activation or expression can serve as a
diagnostic marker for tumor development, growth and/or progression. The
CVSP16 also is intended for use as a drug target and for use in screening assays,
including those CVSP16s exemplified herein. The CVSP16 polypeptides
provided herein are expressed or activated by or in tumor cells, typically at a
level that differs from the level in which they are expressed by the non-tumor
cell of the same type. Hence, for example, if the SP expressed by a prostate or
ovarian tumor cell is to be of interest herein with respect to ovarian or prostate
cancer, it should have an expression, extent of activation or activity that is
different from that in non-tumor cells. CVSP16 is expressed in lung, colon,
prostate, breast, uterine, ovarian and other tumor cells.

In certain embodiments, a CVSP16 polypeptide or a portion thereof is
detectable in a body fluid at a level that differs from its level in body fluids in a
subject not having a tumor. In other embodiments, the polypeptide is present in
a tumor; and a substrate or cofactor for the polypeptide is expressed at levels
that differ from its level of expression in a non-tumor cell in the same type of
tissue.

Detection of the SP, or a protease domain thereof, in body fluids, such as
serum, blood, saliva, cerebral spinal fluid, synovial fluid and interstitial fluids,
urine, sweat and other such fluids and secretions, can serve as a diagnostic
tumor marker. In particular, detection of higher levels or alternative forms of
such polypeptides in a subject compared to a subject known not to have any
neoplastic disease or compared to earlier samples from the same subject, can be
indicative of neoplastic disease in the subject.

Isolated, substantially pure proteases that contain protease domains or
catalytically active (or functionally active) portions thereof in a single chain form
of SPs are provided. A protease domain or plurality thereof can be included in a
longer protein, and such longer protein is optionally the CVSP16 zymogen. In
particular, exemplary protease domains include at least a sufficient portion of
sequences of amino acids set forth in SEQ ID No. 6 (encoded by nucleotides in
SEQ ID No. 5) to exhibit protease activity in an assay provided herein.

Domains, fragments, derivatives or analogs of a CVSP16 that are
functionally active, that is, capable of exhibiting one or more functional activities
associated with the CVSP16 polypeptide, such as serine protease activity,
immunogenicity and antigenicity, are provided. CVSP16 contains a signal
peptide sequence (aa 1 to aa 23) and two protease domains. Also included are
substantially purified CVSP16 zymogen, activated multi-chain forms of the
polypeptide, single chain protease domains and two-chain protease domains. These
are encoded by a nucleic acid that includes sequence encoding at least one
protease domain that exhibits proteolytic activity and that hybridizes to a nucleic
acid molecule including the sequence of nucleotides set forth in SEQ ID No. 5,
typically under moderate, generally under high stringency, conditions and
generally along the full length (or at least about 70, 80 or 90%) of a protease
domain. Polypeptides encoded by splice variants are also provided as long as
the polypeptides do not include at least 5, 7, 10, 15, 20 or more contiguous
amino acids set forth in SEQ ID No. 21.

CVSP16 polypeptides

As described, CVSP16 polypeptides provided herein include:

a) a polypeptide encoded by the sequence of nucleotides set forth
in SEQ ID No. 5;

b) a polypeptide encoded by a sequence of nucleotides that
hybridizes under conditions of low, moderate or high stringency to the sequence
of nucleotides set forth in SEQ ID No. 5, wherein the encoded polypeptide does
not include at least 5, at least 6, at least 7, at least 8, at least 9 or at
least 10 contiguous amino acids from SEQ ID No. 21 up to all of
SEQ ID No. 21, and particularly does not include any amino acids therefrom
between amino acids corresponding to Gln660 and Met661 of SEQ ID No. 6;

c) a polypeptide that contains the sequence of amino acids set
forth in SEQ ID No. 6, particularly, amino acids 24-752 and C-terminal truncated
portions thereof;

d) a polypeptide that contains at least two protease domains of a
serine protease 16 (CVSP16) and includes at least 5 contiguous amino acids
corresponding to residues 508-544 of SEQ ID No. 6, or contains the contiguous
sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp Asn Asp Ser or Cys Trp
Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the second protease domain;

e) a polypeptide that contains a sequence of amino acids having at
least about 60%, 70%, 80%, 90% or about 95% sequence identity with the
sequence of amino acids set forth in SEQ ID No. 6, wherein the polypeptide does
not include at least 5, at least 6, at least 7, at least 8, at least 9 or at
least 10 contiguous amino acids from SEQ ID No. 21 up to all of
SEQ ID No. 21, and particularly does not include any amino acids therefrom
between amino acids corresponding to Gln660 and Met661 of SEQ ID No. 6;

f) a polypeptide encoded by a sequence of nucleotides
that hybridizes under conditions of at least moderate, and can be high,
stringency along at least 70% of its full length to a sequence of nucleotides that
encodes a polypeptide of any of a)-e), wherein the polypeptide does not include
at least 5, at least 6, at least 7, at least 8, at least 9 or at
least 10 contiguous amino acids from SEQ ID No. 21 up to all of SEQ ID No.
21, and particularly does not include any amino acids therefrom between amino
acids corresponding to Gln660 and Met661 of SEQ ID No. 6;

g) a polypeptide that has at least 60%, 70%, 80%, 90% or
about 95% sequence identity with a polypeptide of any of a)-f), wherein the
polypeptide does not include at least 5, at least 6, at least 7, at least 8,
at least 9 or at least 10 contiguous amino acids from SEQ ID No.
21 up to all of SEQ ID No. 21, and particularly does not include any amino acids
therefrom between amino acids corresponding to Gln660 and Met661 of SEQ ID
No. 6; and/or

h) a polypeptide encoded by a splice variant of a sequence of
nucleotides that encodes a CVSP16 polypeptide provided herein. Smaller
portions thereof that retain protease activity are contemplated.
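
Several of the items above exclude polypeptides that contain a run of at least 5 (or more) contiguous amino acids from SEQ ID No. 21. As an illustration of how such a criterion could be checked computationally, the sketch below tests whether a candidate sequence shares any contiguous run of a given minimum length with an excluded reference sequence. The function name `shares_contiguous_run` and the single-letter toy sequences in the usage note are hypothetical; the actual sequences of SEQ ID Nos. 6 and 21 are not reproduced here.

```python
def shares_contiguous_run(candidate: str, excluded: str, min_run: int = 5) -> bool:
    """Return True if candidate contains min_run or more contiguous residues of excluded.

    Any shared run longer than min_run necessarily contains a shared window of
    exactly min_run residues, so comparing fixed-length windows is sufficient.
    """
    if min_run <= 0:
        raise ValueError("min_run must be positive")
    # All windows of length min_run taken from the excluded sequence.
    excluded_windows = {
        excluded[i:i + min_run] for i in range(len(excluded) - min_run + 1)
    }
    # Slide the same window length along the candidate sequence.
    return any(
        candidate[i:i + min_run] in excluded_windows
        for i in range(len(candidate) - min_run + 1)
    )
```

With a toy excluded sequence "MKTAYIAKQR", the candidate "GGGTAYIAGGG" shares the five-residue run "TAYIA" and would be flagged, whereas "GGGTAYIGGG" shares only four contiguous residues and would not.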

An exemplary CVSP16 polypeptide includes the sequence of amino acids
set forth in SEQ ID No. 6; the protease domains are included in the sequence of
amino acids set forth in SEQ ID No. 6.

Also provided herein are isolated substantially pure polypeptides that
contain a protease domain of a CVSP16. The polypeptide also can
include other non-CVSP16 sequences of amino acids, but includes a protease
domain or a sufficient portion thereof to exhibit catalytic activity (or other
functional activity, such as substrate or ligand binding activity) in any in vitro
assay that assesses such protease activity, such as any provided herein. The
CVSP16 polypeptides do not include the sequence of amino acids set forth in
SEQ ID No. 21 and/or they do not include at least 5, 7, 9, 10, 11, 15, 20 or
more contiguous amino acids thereof between amino acids that correspond to
Q660 and M661 of SEQ ID No. 6 and/or the CVSP16 polypeptide contains at
least two protease domains of a serine protease 16 (CVSP16) and includes at
least 5 contiguous amino acids corresponding to residues 508-544 of SEQ ID
No. 6 (SEQ ID No. 22) or contains the contiguous sequence Asn Asp Ser or Trp
Asn Asp or Ser Cys Trp Asn Asp Ser or Cys Trp Asn Asp Ser or Gln Thr His or
Leu Gln Thr His in the second protease domain. Such polypeptides are
zymogens or activated single-, two- or three-chain molecules.

Provided are substantially purified CVSP16 zymogens, activated two and
three chain forms, single chain protease domains and two-chain protease
domains. A full-length exemplary CVSP16 polypeptide, including the signal
sequence, is set forth in SEQ ID No. 6. The signal sequence (see amino acids
1-23, SEQ ID No. 6) can be cleaved upon expression or the encoding nucleic acid
can be deleted prior to expression.

Also provided is a substantially purified protein including a sequence of
amino acids that has at least 60%, 70%, 80%, 90% or about 95% identity to
the exemplified CVSP16 or to a protease domain thereof, and does not include
at least 5, 7, 9, 11, 15, 20 or more contiguous amino acids of SEQ ID No. 21.
Percentage identity is determined using standard algorithms and gap penalties
that maximize the percentage identity. A human CVSP16 polypeptide is
exemplified, although other mammalian CVSP16 polypeptides are contemplated
and are obtained by standard methods using the disclosed CVSP16-encoding
nucleic acid (or antibodies made to the CVSP16) to isolate corresponding nucleic
acid molecules (and/or CVSP16s) from other species. Polypeptides
encoded by splice variants of the exemplified encoding nucleic acid (SEQ ID No.
5), particularly those with a proteolytically active protease domain, but not
containing at least 5, 7, 10, 15, 20 or more contiguous amino acids from SEQ ID
No. 21, particularly inserted between the amino acids corresponding to Q660
and M661 (SEQ ID No. 6), are provided.
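
The exclusion recited above (no run of at least 5, 7, 10, 15, 20 or more
contiguous amino acids from SEQ ID No. 21) lends itself to a simple
computational check. The following is a minimal sketch in Python and is not
part of the disclosed methods; the function names and the short sequences are
hypothetical placeholders, not the sequences of SEQ ID No. 6 or SEQ ID No. 21.

```python
def longest_shared_run(candidate: str, excluded: str) -> int:
    """Length of the longest contiguous stretch common to both sequences."""
    best = 0
    # prev[j] holds the length of the common run ending at the previous
    # candidate residue and at excluded[j - 1]
    prev = [0] * (len(excluded) + 1)
    for a in candidate:
        curr = [0] * (len(excluded) + 1)
        for j, b in enumerate(excluded, start=1):
            if a == b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def contains_excluded_run(candidate: str, excluded: str, min_len: int = 5) -> bool:
    """True when the candidate carries at least min_len contiguous residues
    of the excluded sequence."""
    return longest_shared_run(candidate, excluded) >= min_len


# Hypothetical placeholder sequences (one-letter code)
EXCLUDED = "GSWKLTPAVE"
with_insert = "MKTAYQ" + "WKLTPA" + "MDRSLV"
without_insert = "MKTAYQMDRSLV"

print(contains_excluded_run(with_insert, EXCLUDED))
print(contains_excluded_run(without_insert, EXCLUDED))
```

The threshold `min_len` corresponds to the recited values (5, 7, 10, 15, 20);
a candidate passes the exclusion when the function returns False.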

Provided are substantially purified CVSP16 polypeptides and functional
domains thereof, including catalytically active domains and portions, that have at
least about 60%, 70%, 80%, 90% or about 95% sequence identity with a
CVSP16 that includes the sequence of amino acids set forth in SEQ ID No. 6 or
a catalytically active portion thereof. The CVSP16 polypeptides provided herein
do not include at least 5 contiguous amino acid residues as set forth in SEQ ID
No. 21, particularly between residues corresponding to Q660 and M661 of SEQ ID
No. 6. Also provided are polypeptides that are encoded by the nucleic acid
molecules provided herein. Also provided are polypeptides that contain at least
two protease domains of a serine protease 16 (CVSP16) and include at least 5
contiguous amino acids corresponding to residues 508-544 of SEQ ID No. 6 or
contain the contiguous sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp Asn
Asp Ser or Cys Trp Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the second
protease domain.

In other embodiments, substantially purified polypeptides that include a
protease domain of a CVSP16 polypeptide or a catalytically active portion
thereof, but that do not include the entire sequence of amino acids set forth in
SEQ ID No. 6 are provided. Among these are polypeptides that include a
sequence of amino acids that has at least 60%, 70%, 80%, 90%, 95% or
100% sequence identity to SEQ ID No. 6.
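
The identity thresholds recited above presuppose a way of scoring an
alignment. As a hedged illustration only, the sketch below computes percent
identity over a pairwise alignment that has already been produced by some
standard alignment program; the alignment step, the choice of denominator
(alignment columns) and the toy fragments are assumptions made here and are
not specified in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Gaps are written as '-'. The denominator is the number of alignment
    columns; other conventions (for example, the length of the shorter
    sequence) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments (one-letter code), for illustration only
REF = "IVGGSNAQPG"
VAR = "IVGGTNA-PG"

print(percent_identity(REF, VAR))
```

A candidate would then be compared against each recited threshold (60%, 70%,
80%, 90%, 95%) with `meets_threshold`.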

Included among the polypeptides provided herein is a CVSP16 protease
domain and/or a polypeptide with amino acid changes such that the specificity
and protease activity (or other functional activity) is not eliminated and is
retained at at least 1%, 2%, 3%, 5%, 10%, 20%, 30%, 40% or 50% activity, or
remains substantially unchanged (more than about 50%) or increases. The
CVSP16 polypeptide can form homodimers and also can form heterodimers with
some other protein, such as a membrane-bound protein.

Domains, fragments, derivatives or analogs of a CVSP16 that are
functionally active, that is, capable of exhibiting one or more functional
activities and/or other activities or properties, such as immunogenicity and
antigenicity, also are provided.

Protease domains and antigenic fragments 
Isolated, substantially pure polypeptides that include the protease
domains or catalytically active portions thereof as single chain forms of SPs are
provided. The protease domains can be included in a longer protein, and such
longer proteins include up to a full-length CVSP16 zymogen as long as the
full-length polypeptide does not include SEQ ID No. 21. Provided herein are
isolated substantially pure polypeptides that contain a protease domain of a
CVSP16 as a single chain.

SP protease domains include the single-chain protease domains of
CVSP16. Provided are the protease domains or proteins that include a portion of
a SP that is the protease domain of any SP, particularly a CVSP16. The protein
also can include other non-SP sequences of amino acids, but includes the
protease domain or a sufficient portion thereof to exhibit catalytic and/or binding
activity in any in vitro assay that assesses such activity(ies), such as any
provided herein. Also provided are two- and three-chain activated forms of the
full length protease and also two chain forms of a protease domain.

In an embodiment, the substantially purified CVSP16 protease is encoded
by a nucleic acid that hybridizes to a nucleic acid molecule containing the
protease domain encoded by the nucleotide sequence set forth in SEQ ID No. 5
under at least moderate, generally high, stringency conditions, such that the
protease domain encoding nucleic acid thereof hybridizes along its full length or
along at least about 70%, 80% or 90% of the full length. In other embodiments
the substantially purified CVSP16 protease is a single chain polypeptide that
includes substantially the sequence of amino acids set forth as amino acids
24-752 in SEQ ID No. 6, or a catalytically active (or functionally active or
immunogenic or antigenic) portion thereof.

Polypeptides that additionally include amino acids at the C-terminus, such
as all or a portion of the amino acids following the protease domain (aa 551 to
aa 752 in SEQ ID No. 6) in the exemplified embodiment are provided. Dimers
and other multimers of the full length and catalytically active portions of the
polypeptides that include PD1 and/or PD2 are provided.

A signal peptide (amino acids 1-23 of SEQ ID No. 6 in the exemplified
embodiment) also is provided. In addition the mature CVSP16 polypeptide with
the signal sequence removed and catalytically active portions thereof, including
those that include all or a portion of the C-terminus beyond the protease domain
are provided.

As described below, all forms of the CVSP16, including the
pro-polypeptide with the signal sequence, the mature polypeptide and catalytically
active portions thereof, the protease domains and catalytically active portions
thereof, three-chain, two-chain and single chain forms of any of these
polypeptides are provided herein and can be used in the screening assays and for
preparing specific antibodies therefor. The expression, quantity, activity and/or
activation of the polypeptides in tumor cells and body fluids can be diagnostic of
disease or its absence.

CVSP16 has two protease domains, PD1 and PD2 (see, e.g., amino acids
46 to 286 and amino acids 323 to 550 of SEQ ID No. 6, respectively).
Polypeptides that contain PD1 or PD2 as the only CVSP16 portion thereof, or
that contain PD1 and PD2 but do not include at least 5, 7, 10, 15,
20 or more contiguous amino acids of SEQ ID No. 21, are provided.

With reference to SEQ ID No. 6 of the exemplified CVSP16, the
exemplified CVSP16 has 8 putative N-linked glycosylation sites (...N92GT...,
...N130YS..., ...N217LT..., ...N317CT..., ...N369SS..., ...N^AS..., ...N421LS...,
...N508DS...). The following cysteine pairings are noted: C72-C88, C173-C249,
C206-C^e, C239-C267, C348-C364, CWC^e, C472-C494 and C506-C534. In addition, an
unpaired cysteine (C159) in the first protease domain should pair with C38. An
unpaired cysteine (C430) in the second protease domain should pair with C325 or
C318. As a result, the protease, upon activation cleavage, can contain multiple
chains, including two or three or more chains, formed by virtue of pairing
between the unpaired cysteine in a protease domain and a cysteine outside a
protease domain.
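
The legible sites listed above (for example N92GT, N130YS and N508DS) fit the
tripeptide pattern N-X-S/T. As an illustration of how such putative sites can
be located in a sequence, the following Python sketch scans for that pattern.
It assumes the commonly used convention that X is any residue other than
proline, which is not stated in the text, and it uses a hypothetical toy
sequence rather than SEQ ID No. 6.

```python
import re

# A lookahead is used so that overlapping sites are all reported.
# Pattern: Asn, then any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of Asn, tripeptide) for each N-X-S/T match."""
    return [
        (m.start() + 1, protein[m.start():m.start() + 3])
        for m in SEQUON.finditer(protein)
    ]


# Hypothetical toy sequence (one-letter code)
TOY = "MANGTKLNPSANDSR"

for position, tripeptide in find_sequons(TOY):
    print(position, tripeptide)
```

Such a scan only identifies candidate sites; whether a site is actually
glycosylated is an experimental question.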

The first protease domain is characterized by the presence of a protease
activation cleavage site (...R46↓I47VGGSNAQP..., where ↓ indicates the protease
activation cleavage site) at the beginning of the domain and catalytic triad
residues (H87, D139 and S243) in 3 highly-conserved regions of the catalytic
domain. In the second protease domain, the invariant catalytic histidine is
replaced by a serine (S363) residue, and the highly conserved SGGP sequence
that contains the catalytic serine is replaced with the sequence S510RWS. In
addition, PD2 begins at amino acid 323-324 (Pro323 Glu324), which provides an
unusual cleavage site. Since Pro-Glu residues as a cleavage site are unusual,
cleavage may not be needed for activation. These sequences and differences
from other protease domains indicate that the second protease domain has lower
catalytic activity.

Antigenic epitopes

Antigenic epitopes that contain at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 20, 25, 30, 40, 50, and typically 10-15 amino acids of the CVSP16
polypeptide are provided. These antigenic epitopes are used, for example, to
raise antibodies. Antibodies (see further discussion below) specific for each
epitope or combinations thereof are also provided. Also provided are antibodies
that bind with at least 10-fold or 100-fold greater affinity to CVSP16
polypeptides that do not include the sequence of amino acids set forth in SEQ ID
No. 21 (particularly do not include any amino acids therefrom between amino
acids corresponding to Gln660 and Met661 of SEQ ID No. 6) compared to those
that include SEQ ID No. 21. In particular, provided are antibodies that bind to a
CVSP16 polypeptide provided herein that does not include at least 5, 7, 9, 10,
15, 20 contiguous amino acids of SEQ ID No. 21, particularly where the
contiguous amino acids are inserted between amino acids corresponding to Q660
and M661, with at least 2-, 5-, 10-, 100-fold greater affinity than to a polypeptide
that includes the at least 5 contiguous amino acids set forth in SEQ ID No. 21,
particularly where the contiguous amino acids are inserted between amino acids
corresponding to Q660 and M661.

Muteins and derivatives of CVSP16 polypeptides 

Full-length CVSP16, zymogen and activated forms thereof and CVSP16
protease domains, portions thereof, and muteins and derivatives of such
polypeptides are provided. Among the derivatives are those based on animal
CVSP16s, including, but not limited to, rodent, such as mouse and rat; fowl,
such as chicken; ruminants, such as goats, cows, deer, sheep; porcine, such as
pigs; and humans. For example, CVSP16 derivatives can be made by altering
their sequences by substitutions, additions or deletions. CVSP16 derivatives
include, but are not limited to, those containing, as a primary amino acid
sequence, all or part of the amino acid sequence of CVSP16, including altered
sequences in which functionally equivalent amino acid residues are substituted
for residues within the sequence resulting in a silent change. For example, one
or more amino acid residues within the sequence can be substituted by another
amino acid of a similar polarity which acts as a functional equivalent, resulting in
a silent alteration. Substitutes for an amino acid within the sequence can be
selected from other members of the class to which the amino acid belongs. For
example, the nonpolar (hydrophobic) amino acids include alanine, leucine,
isoleucine, valine, proline, phenylalanine, tryptophan and methionine. The polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine. The positively charged (basic) amino acids include
arginine, lysine and histidine. The negatively charged (acidic) amino acids
include aspartic acid and glutamic acid (see, e.g., Table 1). Muteins of the
CVSP16 or a domain thereof, such as a protease domain, in which up to about
10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 85%, 90% or 95% of the
amino acids are replaced with another amino acid are provided. Generally such
muteins retain at least about 1%, 2%, 3%, 5%, 7%, 8%, 10%, 20%, 30%,
40%, 50%, 60%, 70%, 80%, 90%, 95% or more (or an increased activity, i.e.,
101, 102, 103, 104, 105, 110% or greater) of the protease activity of the
unmutated protein. Those of skill in the art recognize that a polypeptide that
retains at least 1% of the activity of the wild-type protease is sufficiently active
for use in screening assays or in other applications.
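
The amino acid classes recited above can be written down directly. The
following Python sketch, offered only as an illustration, encodes the four
classes exactly as listed in the text and checks whether a substitution stays
within a class; the function names are hypothetical, and the helper that
computes the fraction of replaced positions assumes equal-length, ungapped
sequences.

```python
# Classes as listed in the text, in one-letter code
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "polar_neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def class_of(residue: str) -> str:
    """Name of the class to which a residue belongs."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original: str, substitute: str) -> bool:
    """True when both residues belong to the same class."""
    return class_of(original) == class_of(substitute)


def fraction_replaced(wild_type: str, mutein: str) -> float:
    """Fraction of positions that differ between equal-length sequences."""
    if len(wild_type) != len(mutein):
        raise ValueError("sequences must have equal length")
    if not wild_type:
        raise ValueError("sequences are empty")
    return sum(a != b for a, b in zip(wild_type, mutein)) / len(wild_type)


print(is_conservative("L", "I"))   # both nonpolar
print(is_conservative("K", "E"))   # basic versus acidic
```

Class membership is only a first approximation of functional equivalence;
retained activity still has to be measured in an assay.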

CVSP16 derivatives can be made by altering their sequences by
substitutions, additions or deletions that provide for functionally equivalent
molecules. Due to the degeneracy of nucleotide coding sequences, other nucleic
acid sequences which encode substantially the same amino acid sequence as a
CVSP16 are provided. These include but are not limited to nucleotide sequences
containing all or portions of CVSP16 genes that are altered by the substitution of
different codons that encode the amino acid residue within the sequence, thus
producing a silent change. CVSP16 derivatives include, but are not limited to,
those containing, as a primary amino acid sequence, all or part of the amino acid
sequence of SP, including altered sequences in which amino acid residues are
substituted for residues within the sequence resulting in a silent change or a
change that eliminates an activity. Substitutes for an amino acid within the
sequence can be selected from other members of the class to which the amino
acid belongs or from another class (for non-conservative changes).

Muteins in which one or more of the Cys residues, particularly a residue
that is paired in the activated two-chain or three-chain form but unpaired in the
protease domain alone (i.e., a Cys at residue no. 159 and/or 430 in SEQ ID No. 6),
is/are replaced with any amino acid, typically, although not necessarily, a
conservative amino acid residue, such as Ser, are contemplated. Muteins in
which 10%, 20%, 30%, 35%, 40%, 45%, 50% or more of the amino acids are
replaced but the resulting polypeptide retains at least about 1%, 3%, 5%, 10%,
20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 90% or 95% of the
catalytic activity (or other functional activity or other property, such as
antigenicity or immunogenicity) as the unmodified form for the same substrate
also are provided.

Nucleic acid molecules, vectors and plasmids, cells and expression
of CVSP16 polypeptides

Also provided herein are nucleic acid molecules that encode SP proteins
and the encoded proteins. In particular, nucleic acid molecules encoding CVSP16
from animals, including splice variants thereof, are provided. The encoded proteins
are also provided. Also provided are functional domains thereof. For example,
the SP (CVSP16) protease domains, portions thereof, and muteins thereof are
from or based on animal SPs (CVSPs), including, but not limited to, rodent,
such as mouse and rat; fowl, such as chicken; ruminants, such as goats, cows,
deer, sheep; porcine, such as pigs; and humans. The isolated nucleic acid fragment
is DNA, including genomic or cDNA, or is RNA, or can include other components,
such as PNA. The isolated nucleic acid can include additional components, such
as heterologous or native promoters, and other transcriptional and translational
regulatory sequences; these genes can be linked to other genes, such as reporter
genes or other indicator genes or genes that encode indicators.

Nucleic acid molecules 
Due to the degeneracy of nucleotide coding sequences, other nucleic acid
sequences that encode substantially the same amino acid sequence as a CVSP16
gene (or cDNA or RNA) can be used. These include but are not limited to
nucleotide sequences comprising all or portions of CVSP16 genes that are altered
by the substitution of different codons that encode the amino acid residue within
the sequence, thus producing a silent change.
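
Codon degeneracy and the resulting silent changes can be illustrated with the
standard genetic code. The Python sketch below is illustrative only: it builds
the standard codon table, lists the synonymous codons for a given codon, and
confirms that a synonymous substitution leaves the translation unchanged. The
six-nucleotide example encodes Gln-Met but is a hypothetical sequence, not an
excerpt from SEQ ID No. 5.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    if len(dna) % 3:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(codon: str) -> list[str]:
    """Other codons that encode the same amino acid as the given codon."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


original = "CAGATG"  # hypothetical: Gln-Met
silent = "CAAATG"    # CAG replaced by CAA, still Gln-Met

print(translate(original), translate(silent))
print(synonymous_codons("CAG"))
```

Methionine has a single codon (ATG), so only the glutamine codon can be
altered silently in this example.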

Also provided are nucleic acid molecules that hybridize to the above-noted
sequences of nucleotides encoding CVSP16 at least at low stringency, at
moderate stringency, and/or at high stringency, and that encode the protease
domain and/or the full length protein or other domains of a CVSP16 or a splice
variant or allelic variant thereof. Generally the molecules hybridize under such
conditions along their full length (or along at least about 70%, 80% or 90% of the
full length) for at least one domain and encode at least one domain, such as the
protease domain, of the polypeptide. In particular, such nucleic acid molecules
include any isolated nucleic acid fragment that encodes at least one domain of a
serine protease, that (1) contains a sequence of nucleotides that encodes the
protease or a functionally active, such as catalytically active, domain thereof, and
(2) is selected from among sequences of nucleic acids that encode a CVSP16
polypeptide:

a) a polypeptide encoded by the sequence of nucleotides set forth in SEQ 
ID No. 5; 

b) a polypeptide encoded by a sequence of nucleotides that hybridizes
under conditions of low, moderate or high stringency to the sequence of
nucleotides set forth in SEQ ID No. 5, wherein the encoded polypeptide does not
include at least 5, 6, 7, 8, 9, 15 or 20 contiguous amino acids from SEQ ID No.
21 up to all of SEQ ID No. 21, and particularly does not include the contiguous
amino acids therefrom between amino acids corresponding to Gln660 and Met661 of
SEQ ID No. 6;

c) a polypeptide that contains the sequence of amino acids set forth in SEQ
ID No. 6 or that contains amino acid residues 24-752 or functionally active,
particularly, catalytically active, portions thereof;

d) a polypeptide that contains a sequence of amino acids having at least
about 60%, 70%, 80%, 90% or about 95% sequence identity with the sequence
of amino acids set forth in SEQ ID No. 6, where the polypeptide does not include
at least 5, 6, 7, 8, 9, 10, 15 or 20 contiguous amino acids from SEQ ID
No. 21 up to all of SEQ ID No. 21, and particularly does not include any amino acids
therefrom between amino acids corresponding to Gln660 and Met661 of SEQ ID No.
6;

e) a polypeptide that contains at least two protease domains of a
serine protease 16 (CVSP16) and includes at least 5 contiguous amino acids
corresponding to residues 508-544 of SEQ ID No. 6 or contains the contiguous
sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp Asn Asp Ser or Cys Trp
Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the second protease domain;

f) a polypeptide encoded by a sequence of nucleotides that hybridizes
under conditions of at least moderate, and can be high, stringency along at least
70% of its full length to a sequence of nucleotides that encodes a polypeptide of
any of a)-e), wherein the polypeptide does not include at least 5, 6, 7, 8, 9, 10,
15 or 20 contiguous amino acids from SEQ ID No. 21 up
to all of SEQ ID No. 21, and particularly does not include any amino acids
therefrom between amino acids corresponding to Gln660 and Met661 of SEQ ID No.
6;

g) a polypeptide that has at least 60%, 70%, 80%, 90% or about 95%
sequence identity with a polypeptide of any of a)-), where the polypeptide does
not include at least about 5, 6, 7, 8, 9, 10, 15 or 20 contiguous amino acids from
SEQ ID No. 21 up to all of SEQ ID No. 21, and particularly does not include any
amino acids therefrom between amino acids corresponding to Gln660 and Met661 of
SEQ ID No. 6; and/or

h) a polypeptide encoded by a splice variant of a sequence of nucleotides
that encodes a CVSP16 polypeptide, including a polypeptide of any of a)-f), as
long as the resulting polypeptide does not include at least about
5, 6, 7, 8, 9, 10, 15 or 20 contiguous amino acids from SEQ ID No. 21 up to all
of SEQ ID No. 21, and particularly does not include any amino acids therefrom
between amino acids corresponding to Gln660 and Met661 of SEQ ID No. 6 and/or
the polypeptide encoded by the splice variant contains at least two protease
domains of a serine protease 16 (CVSP16) and includes at least 5 contiguous
amino acids corresponding to residues 508-544 of SEQ ID No. 6 or contains the
contiguous sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp Asn Asp Ser or
Cys Trp Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the second protease
domain. Smaller nucleic acid molecules that encode polypeptides that retain
protease activity as single chains or as other truncated single-chain forms are
provided.

Hence, among the nucleic acid molecules are those that do not encode a
polypeptide that includes at least about 5, 6, 7, 8, 9, 10, 15 or 20 contiguous
amino acids from SEQ ID No. 21 up to all of SEQ ID No. 21, and particularly do
not include any amino acids therefrom between amino acids corresponding to
Gln660 and Met661 of SEQ ID No. 6 and contain:

(a) a sequence of nucleotides that encodes the CVSP16 polypeptide or
a domain thereof, particularly a CVSP16 polypeptide of a)-h) above;

(b) a sequence of nucleotides that encodes such portion or the full
length CVSP16 protease and hybridizes under conditions of
moderate or high stringency, generally to nucleic acid that is
complementary to an mRNA transcript present in a mammalian cell
that encodes a mammalian CVSP16 protein or a fragment thereof;

(c) a sequence of nucleotides that encodes a CVSP16 protease or a
domain thereof that includes a sequence of amino acids encoded by
such portion or the full length open reading frame;

(d) a sequence of nucleotides that encodes a CVSP16 protease that
includes a sequence of amino acids encoded by a sequence of
nucleotides that encodes such protease and hybridizes under
conditions of moderate or high stringency to DNA that is
complementary to an mRNA transcript; and

(e) a sequence of nucleotides that includes degenerate codons of all or
a portion of any of (a)-(d).

The isolated nucleic acid fragment is DNA, including genomic or cDNA, or
is RNA, or can include other components, such as peptide nucleic acid (PNA) or
other modified nucleotides or ribonucleotides or analogs thereof. The isolated
nucleic acid can include additional components, such as heterologous or native
promoters, and other transcriptional and translational regulatory sequences; these
genes can be linked to other genes, such as reporter genes or other indicator
genes or genes that encode indicators.

In an exemplary embodiment, a nucleic acid molecule that encodes a
CVSP16 polypeptide is provided. In particular, an encoding nucleic acid molecule
with an open reading frame within the sequence of nucleotides set forth in SEQ ID
No. 5 is provided. The encoded protein is set forth in SEQ ID No. 6 (see, also
EXAMPLE 1). Also provided are nucleic acid molecules that encode the mature
polypeptide (residues 24-752) and one or both protease domains. The encoded
polypeptide does not include at least about 5, 6, 7, 8, 9, 10, 15 or 20 contiguous
amino acids from SEQ ID No. 21 up to all of SEQ ID No. 21, and particularly does
not include any amino acids therefrom between amino acids corresponding to
Gln660 and Met661 of SEQ ID No. 6.

Also provided are nucleic acid molecules that hybridize under conditions of
at least low stringency, typically moderate stringency, and generally high
stringency, and typically along at least 70%, 80%, 90% or 95% of their full
length to the above sequence of nucleic acids (SEQ ID No. 5 or degenerates
thereof), particularly to the open reading frame encompassed by nucleotides that
encode a single protease domain or any domain of CVSP16, and/or to any of the
CVSP16 polypeptides and fragments thereof as described herein.

Also provided are nucleic acid molecules that encode a single chain
CVSP16 protease that has proteolytic activity in an in vitro proteolysis assay and
that encode a polypeptide that has at least 80%, 85%, 90% or 95% sequence
identity with the full length CVSP16 polypeptide of SEQ ID No. 6.

Also provided are nucleic acid molecules that encode a single chain
CVSP16 protease that has proteolytic activity in an in vitro proteolysis assay and
that encode a polypeptide that has at least 60%, 70%, 80%, 85%, 90% or 95%
sequence identity with the full length of a protease domain of the CVSP16
polypeptide of SEQ ID No. 6, or that hybridize along their full length to a nucleic
acid that encodes a protease domain, particularly under conditions of moderate,
generally high, stringency. As above, the encoded polypeptides contain the
protease as a single chain.

The isolated nucleic acids can include at least 8 nucleotides of a
CVSP16-encoding sequence. In other embodiments, the nucleic acids can contain
at least 10, 14, 16, 50, 100 nucleotides, 150 nucleotides, or 200 nucleotides of a
CVSP16-encoding sequence provided that the nucleic acid molecule includes
codons encoding Q660 and M661 as contiguous amino acids, particularly those
encoding amino acids 655-665 or smaller portions thereof that include Q660 and
M661 of SEQ ID No. 6. The nucleic acid molecules generally include a sequence of
nucleotides that encodes residues that correspond to Q660 and M661 of SEQ ID No.
6.

For each of the nucleic acid molecules, the nucleic acid can be DNA or
RNA or PNA or other nucleic acid analogs or can include non-natural nucleotide
bases. Also provided are isolated nucleic acid molecules that include a sequence
of nucleotides complementary to the nucleotide sequence encoding a SP.

Probes, primers, antisense oligonucleotides and dsRNA 
Also provided are fragments thereof or oligonucleotides that can be used
as probes or primers and that contain at least about 10, 14, 16 nucleotides,
generally less than 1000 or less than or equal to 100, set forth in SEQ ID No. 5
(or the complement thereof); or contain at least about 30 nucleotides (or the
complement thereof) or contain oligonucleotides that hybridize along their full
length or along at least about 70%, 80% or 90% of their full length to any such
fragments or oligonucleotides. The length of the fragments is a function of the
purpose for which they are used and/or the complexity of the genome of interest.
Generally probes and primers contain less than about 500, 150, 100 nucleotides.
The probes and primers generally span the nucleotides that encode the residues
corresponding to Q660 and M661.

Probes and primers derived from the nucleic acid molecules are provided.
Such probes and primers contain at least 8, 14, 16, 30, 100 or more contiguous
nucleotides with identity to contiguous nucleotides of a CVSP16, particularly
those that span nucleotides corresponding to 1978-1983 of SEQ ID No. 5. The
probes and primers are optionally labelled with a detectable label, such as a
radiolabel or a fluorescent tag, or can be mass differentiated for detection by mass
spectrometry or other means.
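
The range 1978-1983 cited above covers six nucleotides, the length of two
codons. The coordinate arithmetic behind a probe that spans this junction can
be sketched as follows. This is an illustration only: coordinates are taken
as 1-based and inclusive, and the default `orf_start=1` is an assumption made
here (under it, the codons for residues 660 and 661 fall at 1978-1983); it is
not taken from the sequence listing.

```python
JUNCTION = (1978, 1983)  # nucleotide range cited in the text (1-based, inclusive)


def codon_span(residue: int, orf_start: int = 1) -> tuple[int, int]:
    """Nucleotide range of the codon for a given residue number.

    orf_start=1 is an assumption for illustration.
    """
    first = orf_start + 3 * (residue - 1)
    return first, first + 2


def spans_junction(probe_start: int, probe_len: int,
                   junction: tuple[int, int] = JUNCTION) -> bool:
    """True when the probe covers every nucleotide of the junction."""
    probe_end = probe_start + probe_len - 1
    return probe_start <= junction[0] and probe_end >= junction[1]


print(codon_span(660), codon_span(661))
print(spans_junction(1970, 16))  # covers nucleotides 1970-1985
print(spans_junction(1980, 16))  # starts inside the junction
```

A probe that starts or ends inside the junction would hybridize to only one
flank and would not distinguish sequences that carry an insertion at that
position.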

Also provided are fragments thereof or oligonucleotides that can be used
as probes or primers and that contain at least about 10, 14, 16 nucleotides,
generally less than 1000 or less than or equal to 100, set forth in SEQ ID No. 5
(or the complement thereof), particularly those that span nt. 1978-1983; or
contain at least about 30 nucleotides (or the complement thereof) or contain
oligonucleotides that hybridize along their full length (or at least about 70, 80 or
90% thereof) to any such fragments or oligonucleotides. The length of the
fragments is a function of the purpose for which they are used and/or the
complexity of the genome of interest. Generally probes and primers contain less
than about 500, 150, 100 nucleotides.

Also provided is an isolated nucleic acid molecule that includes the
sequence of nucleotides that is complementary to the nucleotide sequence
encoding CVSP16 or the portion thereof. Double-stranded RNA (dsRNA), such as
RNAi, also is provided.

Plasmids, vectors and cells 
Plasmids and vectors containing the nucleic acid molecules are also
provided. Cells containing the vectors, including cells that express the encoded
proteins are provided. The cell can be a bacterial cell, a yeast cell, a fungal cell, a
plant cell, an insect cell or an animal cell. Methods for producing a SP or single
chain form of the protease domain thereof by, for example, growing the cell
under conditions whereby the encoded SP is expressed by the cell, and recovering
the expressed protein, are provided herein. As noted, for CVSP16, the full-length
zymogens and activated proteins and activated (two- or three-chain) protease and
single chain protease domains are provided.

As discussed below, the CVSP16 polypeptide, and catalytically active
portions thereof, can be expressed as a secreted protein using the native signal
sequence or a heterologous signal. Alternatively the protein can be expressed as
inclusion bodies in the cytoplasm and isolated therefrom. The resulting protein
can be treated to refold (see, e.g., EXAMPLE 1). Active protease domain can be
produced by expression in inclusion bodies, isolation therefrom and denaturation
followed by refolding.

C. Tumor specificity and tissue expression profiles 

Each SP has a characteristic tissue expression profile; the SPs in particular,
although not exclusively expressed or activated in tumors, exhibit characteristic
tumor tissue expression or activation profiles. In some instances, SPs can have
different activity in a tumor cell from a non-tumor cell by virtue of a change in a
substrate or cofactor therefor or other factor that would alter the apparent
functional activity of the SP. Hence each can serve as a diagnostic marker for
particular tumors, by virtue of a level of activity and/or expression or function in a
subject (i.e., a mammal, particularly a human) with neoplastic disease, compared
to a subject or subjects that do not have the neoplastic disease. In addition,
detection of activity (and/or expression) in a particular tissue can be indicative of
neoplastic disease.

Circulating CVSP16s in body fluids can be indicative of neoplastic disease.
Secreted CVSP16 or activated CVSP16 can be indicative of neoplastic disease.
Also, by virtue of an activity and/or expression profile of each, CVSP16s can serve
as therapeutic targets, such as by administration of modulators of an activity
thereof, or, as by administration of a prodrug specifically activated by one of the
CVSP16s.

Tissue expression profiles 
CVSP16

Results indicate that the CVSP16 transcript is strongly expressed in several
tissues including kidney, stomach, colon, spleen, thyroid gland, trachea and
pituitary gland. The CVSP16 transcript also is found in many other tissues at a
lower level. Among tumor cell lines, the CVSP16 transcript occurs, for example,
in cervical HeLa S3, lung A549, leukemia K-562, Burkitt's lymphoma Raji,
leukemia HL-60, colorectal SW480, Burkitt's lymphoma Daudi and leukemia
MOLT-4. The CVSP16 transcript also was detected (in decreasing signal
intensity) in normal breast, normal prostate, breast carcinoma cell line DU4475,
prostate carcinoma cell line PC-3, prostate carcinoma cell line LNCaP, breast
carcinoma cell line MDA-MB-231, and breast carcinoma cell line MDA-MB-453.
Tissue expression profiles are described in Example 1.

D. Identification and isolation of SP protein genes

The SP polypeptides, including CVSP16 polypeptides, or domains thereof,
can be obtained by methods well known in the art for protein purification and
recombinant protein expression. Any method known to those of skill in the art
for identification of nucleic acids that encode desired genes can be used. Any
method available in the art can be used to obtain a full length (i.e., encompassing
the entire coding region) cDNA or genomic DNA clone encoding a SP protein. In
particular, the polymerase chain reaction (PCR) can be used to amplify a sequence
identified as being differentially expressed or encoding proteins activated at
different levels in tumor and non-tumor cells or tissues, e.g., nucleic acids
encoding a CVSP16 polypeptide (SEQ ID NOs: 5, 6, 12 and 13), in a genomic or
cDNA library. Oligonucleotide primers that hybridize to sequences at the 3' and
5' termini of the identified sequences can be used as primers to amplify by PCR
sequences from a nucleic acid sample (RNA or DNA), typically a cDNA library,
from an appropriate source (e.g., tumor or cancer tissue).

Amplification, such as PCR, can be carried out using a thermal cycler and a
thermostable DNA polymerase. The nucleic acid that is amplified can include
mRNA or cDNA or genomic DNA from any eukaryotic species. One can choose to
synthesize several different degenerate primers for use in the PCR reactions. It
also is possible to vary the stringency of hybridization conditions used in priming
the PCR reactions, to amplify nucleic acid orthologs or homologs (e.g., to obtain
SP protein sequences from species other than humans or to obtain human
sequences with homology to CVSP16 polypeptide) by allowing for greater or
lesser degrees of nucleotide sequence similarity between the known nucleotide
sequence and the nucleic acid homolog being isolated. For cross species
hybridization, low or moderate stringency conditions are used. For same species
hybridization, moderate or high stringency conditions generally are used. After
successful amplification of the nucleic acid containing all or a portion of the
identified CVSP16 protein sequence or of a nucleic acid encoding all or a portion
of a CVSP16 protein homolog, that segment can be molecularly cloned and
sequenced, and used as a probe to isolate a complete cDNA or genomic clone.
This, in turn, permits the determination of the gene's complete nucleotide
sequence, the analysis of its expression, and the production of its protein product
for functional analysis. Once the nucleotide sequence is determined, an open
reading frame encoding the SP protein gene product can be determined by
any method well known in the art for determining open reading frames, for
example, using publicly available computer programs for nucleotide sequence
analysis. Once an open reading frame is defined, it is routine to determine the
amino acid sequence of the protein encoded by the open reading frame. In this
way, the nucleotide sequences of the entire SP protein genes as well as the amino
acid sequences of SP proteins and analogs can be identified.
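
The open reading frame determination described above can be illustrated with a
minimal sketch. It is not one of the publicly available programs referred to in
the text; it assumes the standard genetic code, scans only the three
forward-strand frames, and treats the first ATG in a frame as the start codon.
The function names and the example sequence are hypothetical.

```python
# Illustrative sketch only: a naive forward-strand ORF scan using the
# standard genetic code. Function names and the example sequence are
# hypothetical and are not taken from the specification.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=1):
    """Return (start, end) index pairs of ATG...stop ORFs in the 3 forward frames."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # end index includes the stop codon
                start = None
    return orfs


def translate(orf_dna):
    """Translate an ORF (start codon through stop codon) into a protein string."""
    protein = "".join(
        CODON_TABLE[orf_dna[i:i + 3]] for i in range(0, len(orf_dna) - 2, 3)
    )
    return protein.rstrip("*")  # drop the terminal stop symbol


if __name__ == "__main__":
    example = "CCATGGCTTGGTAACC"  # hypothetical sequence
    for start, end in find_orfs(example):
        print(start, end, translate(example[start:end]))
```

A practical analysis would also scan the reverse complement and would apply a
minimum length appropriate to the expected protein.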

Any eukaryotic cell potentially can serve as the nucleic acid source for the
molecular cloning of the SP protein gene. The nucleic acids can be isolated from
vertebrate, mammalian, human, porcine, bovine, feline, avian, equine, canine, as
well as additional primate sources, insects, plants, and other sources. The DNA
can be obtained by standard procedures known in the art from cloned DNA (e.g.,
a DNA "library"), by chemical synthesis, by cDNA cloning, or by the cloning of
genomic DNA, or fragments thereof, purified from the desired cell (see, e.g.,
Sambrook et al. (2001) Molecular Cloning: A Laboratory Manual (3rd Ed.), Vol. 1-3,
Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.). Clones derived from
genomic DNA can contain regulatory and intron DNA regions in addition to coding
regions; clones derived from cDNA contain only exon sequences. Whatever the
source, the gene should be molecularly cloned into a suitable vector for
propagation of the gene.

In the molecular cloning of the gene from genomic DNA, DNA fragments
are generated, some of which encode the desired gene. The DNA can be cleaved
at specific sites using various restriction enzymes. Alternatively, one can use
DNase in the presence of manganese to fragment the DNA, or the DNA can be
physically sheared, for example, by sonication. The linear DNA fragments can
then be separated according to size by standard techniques, including but not
limited to, agarose and polyacrylamide gel electrophoresis and column
chromatography.

Once the DNA fragments are generated, identification of the specific DNA
fragment containing the desired gene can be accomplished in a number of ways.
For example, a portion of the SP protein (of any species) gene (e.g., a PCR
amplification product obtained as described above or an oligonucleotide having a
sequence of a portion of the known nucleotide sequence) or its specific RNA, or a
fragment thereof can be purified and labeled, and the generated DNA fragments
can be screened by nucleic acid hybridization to the labeled probe (Benton and
Davis, Science 196:180 (1977); Grunstein and Hogness, Proc. Natl. Acad. Sci.
U.S.A. 72:3961 (1975)). Those DNA fragments with substantial homology to the
probe hybridize. It also is possible to identify the appropriate fragment by
restriction enzyme digestion(s) and comparison of fragment sizes with those
expected according to a known restriction map if such is available or by DNA
sequence analysis and comparison to the known nucleotide sequence of SP
protein. Further selection can be carried out on the basis of the properties of the
gene. Alternatively, the presence of the gene can be detected by assays based
on the physical, chemical, or immunological properties of its expressed product.
For example, cDNA clones, or DNA clones which hybrid-select the proper mRNA,
can be selected which produce a protein that, e.g., has similar or identical
electrophoretic migration, isoelectric focusing behavior, proteolytic digestion
maps, antigenic properties or serine protease activity. If an anti-SP protein antibody
is available, the protein can be identified by binding of labeled antibody to the
putatively SP protein synthesizing clones, in an ELISA (enzyme-linked
immunosorbent assay)-type procedure.
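
The comparison of restriction fragment sizes with those expected from a known
map, mentioned above, can be sketched as follows. This is an illustration under
stated assumptions only: the DNA is treated as linear, a single enzyme is used
(EcoRI, which recognizes GAATTC and cuts after the first G, is taken as the
default), and the 5% size tolerance, the function names and the example
sequence are hypothetical.

```python
# Illustrative sketch only: predict fragment sizes for a single-enzyme digest
# of linear DNA and compare them with observed sizes. The default site is
# EcoRI (G^AATTC); the tolerance and all names are hypothetical.


def digest_fragment_sizes(dna, site="GAATTC", cut_offset=1):
    """Return fragment lengths, in order, after cutting linear DNA at each site."""
    dna = dna.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)  # cut position within the recognition site
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def matches_restriction_map(observed_sizes, expected_sizes, tolerance=0.05):
    """True if each observed size is within a fractional tolerance of the expected size."""
    if len(observed_sizes) != len(expected_sizes):
        return False
    pairs = zip(sorted(observed_sizes), sorted(expected_sizes))
    return all(abs(obs - exp) <= tolerance * exp for obs, exp in pairs)


if __name__ == "__main__":
    example = "AAAAGAATTCCCCCCGAATTCTT"  # hypothetical 23-base sequence
    expected = digest_fragment_sizes(example)
    print(expected)
    print(matches_restriction_map([7, 5, 11], expected))
```

Sizes read from a gel are approximate, which is why the comparison uses a
tolerance rather than exact equality.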

Alternatives to isolating the CVSP16 polypeptide genomic DNA include,
but are not limited to, chemically synthesizing the gene sequence from a known
sequence or making cDNA to the mRNA that encodes the SP protein. For
example, RNA for cDNA cloning of the SP protein gene can be isolated from cells
expressing the protein. The identified and isolated nucleic acids can then be
inserted into an appropriate cloning vector. A large number of vector-host
systems known in the art can be used. Possible vectors include, but are not
limited to, plasmids or modified viruses, but the vector system must be
compatible with the host cell used. Such vectors include, but are not limited to,
bacteriophages such as lambda derivatives, or plasmids such as pBR322 or pUC
plasmid derivatives or the Bluescript vector (Stratagene, La Jolla, CA). The
insertion into a cloning vector can, for example, be accomplished by ligating the
DNA fragment into a cloning vector which has complementary cohesive termini.
Insertion can be effected using TOPO cloning vectors (INVITROGEN, Carlsbad,
CA). If the complementary restriction sites used to fragment the DNA are not
present in the cloning vector, the ends of the DNA molecules can be enzymatically
modified. Alternatively, any site desired can be produced by ligating nucleotide
sequences (linkers) onto the DNA termini; these ligated linkers can contain
specific chemically synthesized oligonucleotides encoding restriction endonuclease
recognition sequences. In an alternative method, the cleaved vector and SP
protein gene can be modified by homopolymeric tailing. Recombinant molecules
can be introduced into host cells via, for example, transformation, transfection,
infection, electroporation and sonoporation, so that many copies of the gene
sequence are generated.

In specific embodiments, transformation of host cells with recombinant 
DNA molecules that incorporate the isolated SP protein gene, cDNA, or 
synthesized DNA sequence enables generation of multiple copies of the gene.
Thus, the gene can be obtained in large quantities by growing transformants, 
isolating the recombinant DNA molecules from the transformants and, when 
necessary, retrieving the inserted gene from the isolated recombinant DNA. 

E. Vectors, plasmids and cells that contain nucleic acids encoding a SP
protein or protease domain thereof and expression of SP proteins

Vectors and cells 

For recombinant expression of one or more of the SP proteins, the nucleic
acid containing all or a portion of the nucleotide sequence encoding the SP protein
can be inserted into an appropriate expression vector, i.e., a vector that contains
the necessary elements for the transcription and translation of the inserted protein
coding sequence. The necessary transcriptional and translational signals also can
be supplied by the native promoter for SP genes, and/or their flanking regions.

Also provided are vectors that contain nucleic acid encoding the SPs. Cells
containing the vectors are also provided. The cells include eukaryotic and
prokaryotic cells, and the vectors are any suitable for use therein.

Prokaryotic and eukaryotic cells, including endothelial cells, containing the
vectors are provided. Such cells include bacterial cells, yeast cells, fungal cells,
Archaea, plant cells, insect cells and animal cells. The cells are used to produce a
SP protein or protease domain thereof by growing the above-described cells under
conditions whereby the encoded SP protein or protease domain of the SP protein
is expressed by the cell, and recovering the expressed protease domain protein.
For purposes herein, the protease domain can be secreted into the medium.

In one embodiment, vectors that include a sequence of nucleotides that
encodes a polypeptide that has protease activity and contains all or a portion of
the protease domain, or multiple copies thereof, of a SP protein are provided.
Also provided are vectors that contain a sequence of nucleotides that encodes the
protease domain and additional portions of a SP protein up to and including a full
length SP protein, as well as multiple copies thereof. The
vectors can be selected for expression of the SP protein or protease domain
thereof in the cell or such that the SP protein is expressed as a secreted protein.
When the protease domain is expressed, the nucleic acid is linked to nucleic acid
encoding a secretion signal, such as the Saccharomyces cerevisiae α mating
factor signal sequence or a portion thereof, or the native signal sequence.

A variety of host-vector systems can be used to express the protein coding
sequence. These include but are not limited to mammalian cell systems infected
with virus (e.g., vaccinia virus, adenovirus and other viruses); insect cell systems
infected with virus (e.g., baculovirus); microorganisms such as yeast containing
yeast vectors; or bacteria transformed with bacteriophage DNA, plasmid DNA, or
cosmid DNA. The expression elements of vectors vary in their strengths and
specificities. Depending on the host-vector system used, any one of a number of
suitable transcription and translation elements can be used.

Any methods known to those of skill in the art for the insertion of DNA
fragments into a vector can be used to construct expression vectors containing a
chimeric gene containing appropriate transcriptional/translational control signals
and protein coding sequences. These methods can include in vitro recombinant
DNA and synthetic techniques and in vivo recombinants (genetic recombination).

Expression of nucleic acid sequences encoding SP protein, or domains,
derivatives, fragments or homologs thereof, can be regulated by a second nucleic
acid sequence so that the genes or fragments thereof are expressed in a host
transformed with the recombinant DNA molecule(s). For example, expression of
the proteins can be controlled by any promoter/enhancer known in the art. In a
specific embodiment, the promoter is not native to the genes for SP protein.
Promoters which can be used include but are not limited to the SV40 early
promoter (Bernoist and Chambon, Nature 290:304-310 (1981)), the promoter
contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al.,
Cell 22:787-797 (1980)), the herpes thymidine kinase promoter (Wagner et al.,
Proc. Natl. Acad. Sci. USA 78:1441-1445 (1981)), the regulatory sequences of
the metallothionein gene (Brinster et al., Nature 296:39-42 (1982)); prokaryotic
expression vectors such as the β-lactamase promoter (Villa-Kamaroff et al., Proc.
Natl. Acad. Sci. USA 75:3727-3731 (1978)) or the tac promoter (DeBoer et al.,
Proc. Natl. Acad. Sci. USA 80:21-25 (1983)); see also "Useful Proteins from
Recombinant Bacteria" in Scientific American 242:79-94 (1980); plant
expression vectors containing the nopaline synthetase promoter (Herrera-Estrella et
al., Nature 303:209-213 (1984)) or the cauliflower mosaic virus 35S RNA
promoter (Gardner et al., Nucleic Acids Res. 9:2871 (1981)), and the promoter of
the photosynthetic enzyme ribulose bisphosphate carboxylase (Herrera-Estrella et
al., Nature 310:115-120 (1984)); promoter elements from yeast and other fungi
such as the Gal4 promoter, the alcohol dehydrogenase promoter, the
phosphoglycerol kinase promoter, the alkaline phosphatase promoter, and the
following animal transcriptional control regions that exhibit tissue specificity and
have been used in transgenic animals: elastase I gene control region which is
active in pancreatic acinar cells (Swift et al., Cell 38:639-646 (1984); Ornitz et
al., Cold Spring Harbor Symp. Quant. Biol. 50:399-409 (1986); MacDonald,
Hepatology 7:425-515 (1987)); insulin gene control region which is active in
pancreatic beta cells (Hanahan et al., Nature 315:115-122 (1985)),
immunoglobulin gene control region which is active in lymphoid cells (Grosschedl
et al., Cell 38:647-658 (1984); Adams et al., Nature 318:533-538 (1985);
Alexander et al., Mol. Cell Biol. 7:1436-1444 (1987)), mouse mammary tumor
virus control region which is active in testicular, breast, lymphoid and mast cells
(Leder et al., Cell 45:485-495 (1986)), albumin gene control region which is
active in liver (Pinckert et al., Genes and Devel. 1:268-276 (1987)), alpha-
fetoprotein gene control region which is active in liver (Krumlauf et al., Mol. Cell.
Biol. 5:1639-1648 (1985); Hammer et al., Science 235:53-58 (1987)), alpha-1
antitrypsin gene control region which is active in liver (Kelsey et al., Genes and
Devel. 1:161-171 (1987)), beta globin gene control region which is active in
myeloid cells (Mogram et al., Nature 315:338-340 (1985); Kollias et al., Cell
46:89-94 (1986)), myelin basic protein gene control region which is active in
oligodendrocyte cells of the brain (Readhead et al., Cell 48:703-712 (1987)),
myosin light chain-2 gene control region which is active in skeletal muscle (Sani,
Nature 314:283-286 (1985)), and gonadotropic releasing hormone gene control
region which is active in gonadotrophs of the hypothalamus (Mason et al.,
Science 234:1372-1378 (1986)).

In a specific embodiment, a vector is used that contains a promoter
operably linked to nucleic acids encoding a SP protein, or a domain, fragment,
derivative or homolog thereof, one or more origins of replication, and optionally,
one or more selectable markers (e.g., an antibiotic resistance gene). Expression
vectors containing the coding sequences, or portions thereof, of a SP protein, are
made, for example, by subcloning the coding portions into the EcoRI restriction
site of each of the three pGEX vectors (glutathione S-transferase expression
vectors; Smith and Johnson, Gene 67:31-40 (1988)). This allows for the
expression of products in the correct reading frame. Vectors and systems for
expression of the protease domains of the SP proteins include the well known
Pichia vectors (available, for example, from Invitrogen, San Diego, CA),
particularly those designed for secretion of the encoded proteins. One exemplary
vector is described in the EXAMPLES.

Plasmids for transformation of E. coli cells include, for example, the pET
expression vectors (see, U.S. Patent No. 4,952,496; available from NOVAGEN,
Madison, WI; see, also literature published by Novagen describing the system).
Such plasmids include pET 11a, which contains the T7lac promoter, T7
terminator, the inducible E. coli lac operator, and the lac repressor gene; pET 12a-c,
which contains the T7 promoter, T7 terminator, and the E. coli ompT secretion
signal; and pET 15b and pET19b (NOVAGEN, Madison, WI), which contain a
His-Tag™ leader sequence for use in purification with a His column and a thrombin
cleavage site that permits cleavage following purification over the column, the
T7-lac promoter region and the T7 terminator.

The vectors are introduced into host cells, such as Pichia cells and bacterial
cells, such as E. coli, and the proteins expressed therein. Pichia strains, which are
known and readily available, include, for example, GS115. Bacterial hosts can
contain chromosomal copies of DNA encoding T7 RNA polymerase operably linked
to an inducible promoter, such as the lacUV5 promoter (see, U.S. Patent No.
4,952,496). Such hosts include, but are not limited to, the lysogenic E. coli strain
BL21(DE3).

Expression and production of proteins

The CVSP16 domains, derivatives and analogs can be produced by various
methods known in the art. For example, once a recombinant cell expressing a SP
protein, or a domain, fragment or derivative thereof, is identified, the individual
gene product can be isolated and analyzed. This is achieved by assays based on
the physical and/or functional properties of the protein, including, but not limited
to, radioactive labeling of the product followed by analysis by gel electrophoresis,
immunoassay and cross-linking to marker-labeled product.

The CVSP16 polypeptides can be isolated and purified by standard
methods known in the art (either from natural sources or recombinant host cells
expressing the complexes or proteins), including but not restricted to column
chromatography (e.g., ion exchange, affinity, gel exclusion, reversed-phase high
pressure and fast protein liquid), differential centrifugation, differential solubility,
or by any other standard technique used for the purification of proteins.
Functional properties can be evaluated using any suitable assay known in the art.

Once a SP protein or its domain or derivative is identified, the amino acid
sequence of the protein can be deduced from the nucleotide sequence of the gene
which encodes it. In addition, domains, analogs and derivatives of a SP protein
can be chemically synthesized by standard chemical methods known in the art
(e.g., see Hunkapiller et al. (1984) Nature 310:105-111). For example, a peptide
corresponding to a portion of a SP protein, which includes the desired domain or
which mediates the desired activity in vitro, can be synthesized by use of a peptide
synthesizer. Furthermore, if desired, nonclassical amino acids or chemical amino
acid analogs can be introduced as a substitution or addition into the SP protein
sequence. Non-classical amino acids include but are not limited to the D-isomers
of the common amino acids, α-amino isobutyric acid, 4-aminobutyric acid, Abu,
2-aminobutyric acid, γ-Abu, ε-Ahx, 6-amino hexanoic acid, Aib, 2-amino isobutyric
acid, 3-amino propionic acid, ornithine, norleucine, norvaline, hydroxyproline,
sarcosine, citrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine,
cyclohexylalanine, β-alanine, fluoro-amino acids, designer amino acids such as
β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, and amino
acid analogs in general. The amino acids can be D (dextrorotatory) or L
(levorotatory).

Manipulations of SP protein sequences can be made at the protein level.
Also contemplated herein are SP proteins, domains thereof, derivatives or analogs
or fragments thereof, which are differentially modified during or after translation,
e.g., by glycosylation, acetylation, phosphorylation, amidation, derivatization by
known protecting/blocking groups, proteolytic cleavage, linkage to an antibody
molecule or other cellular ligand. Any of numerous chemical modifications can be
carried out by known techniques, including but not limited to specific chemical
cleavage by cyanogen bromide, trypsin, chymotrypsin, papain, V8 protease,
NaBH4, acetylation, formylation, oxidation, reduction and metabolic synthesis in
the presence of tunicamycin.

In cases where natural products are suspected of having a mutation or are
isolated from new species, the amino acid sequence of the SP protein isolated
from the natural source, as well as those expressed in vitro, or from synthesized
expression vectors in vivo or in vitro, can be determined from analysis of the DNA
sequence, or alternatively, by direct sequencing of the isolated protein. Such
analysis can be performed by manual sequencing or through use of an automated
amino acid sequenator.

In particular, the protease domain (or full length or other fragment) of the
CVSP16 can be expressed intracellularly without a signal sequence, which results
in accumulation or formation of inclusion bodies containing the protease domain.
The inclusion bodies are isolated, denatured, solubilized and refolded to yield a
protease domain (or full length or other fragment), which is then activated by
cleavage at the activation cleavage site.

Modifications

A variety of modifications of the SP proteins and domains are
contemplated herein. A SP-encoding nucleic acid molecule can be modified by
any of numerous strategies known in the art (Sambrook et al. (1989) Molecular
Cloning, A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, and Sambrook et al. (2001) Molecular Cloning: A Laboratory
Manual (3rd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor,
N.Y.). The sequences can be cleaved at appropriate sites with restriction
endonuclease(s), followed by further enzymatic modification if desired, isolated,
and ligated in vitro. In the production of the gene encoding a domain, derivative
or analog of SP, care should be taken to ensure that the modified gene retains the
original translational reading frame, uninterrupted by translational stop signals, in
the gene region where the desired activity is encoded.
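
The reading-frame requirement stated above lends itself to a simple check. The
following is a minimal sketch, assuming the coding region is supplied as a DNA
string that begins at the first base of a codon and uses the standard stop
codons; the function name and example sequences are hypothetical.

```python
# Illustrative sketch only: check that a modified coding region is still in
# frame and is not interrupted by a stop codon. Names are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def retains_reading_frame(coding_region):
    """True if the length is a multiple of three and no stop codon occurs
    before the final codon (a terminal stop codon is permitted)."""
    seq = coding_region.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # an insertion or deletion has shifted the frame
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    print(retains_reading_frame("ATGGCTTGGTAA"))  # in frame, terminal stop only
    print(retains_reading_frame("ATGTAAGCTTGG"))  # premature stop codon
    print(retains_reading_frame("ATGGCTTG"))      # length not a multiple of 3
```

Such a check covers only the frame and stop signals; it does not indicate
whether the encoded domain retains the desired activity.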

Additionally, the SP-encoding nucleic acid molecules can be mutated in
vitro or in vivo, to create and/or destroy translation, initiation, and/or termination
sequences, or to create variations in coding regions and/or form new restriction
endonuclease sites or destroy pre-existing ones, to facilitate further in vitro
modification. Also, as described herein, muteins with primary sequence
alterations, such as replacements of Cys residues and elimination of glycosylation
sites, are contemplated. Such mutations can be effected by any technique for
mutagenesis known in the art, including, but not limited to, chemical mutagenesis
and in vitro site-directed mutagenesis (Hutchinson et al., J. Biol. Chem.
253:6551-6558 (1978)), use of TAB® linkers (Pharmacia). In one embodiment,
for example, a SP protein or domain thereof is modified to include a fluorescent
label. In other specific embodiments, the SP protein is modified to have a
heterofunctional reagent, which can be used to crosslink the members of the
complex.

F. Screening methods 

The single chain protease domains can be used in a variety of methods to
identify compounds that modulate the activity thereof. For SPs that exhibit higher
activity or expression in tumor cells, compounds that inhibit the proteolytic
activity are of particular interest. For any SPs that are active at lower levels in
tumor cells, compounds or agents that enhance the activity are potentially of
interest. In all instances the identified compounds include agents that are
candidate cancer treatments. Several types of assays are described herein.
It is understood that the protease domains can be used in other assays. A
single-chain protease domain can exhibit catalytic activity. As such the protease
domains are useful for in vitro screening assays, including, for example, in binding
assays.

The CVSP16 full length zymogens, activated enzymes, single-, two- and
three-chain protease domains are contemplated for use in any screening assay
known to those of skill in the art, including those provided herein. Hence the
following description, if directed to proteolytic assays, is intended to apply to use
of an active single-chain protease domain or a catalytically active portion thereof.
Other assays, such as binding assays, are provided herein, particularly for use with
a CVSP16, including any variants, such as splice variants thereof.

1. Catalytic Assays for identification of agents that modulate the
protease activity of a SP protein

Methods for identifying a modulator of the catalytic activity of a SP,
particularly a single chain protease domain or catalytically active portion thereof,
are provided herein. The methods can be practiced by contacting the
CVSP16, a full-length zymogen or activated form, and particularly a single-chain
domain thereof, with a substrate of the CVSP16 in the presence of a test
substance, and detecting the proteolysis of the substrate, whereby the activity of
the CVSP16 is assessed, and comparing the activity to a control. For example,
the control can be the activity of the CVSP16 assessed by contacting a CVSP16,
including a full-length zymogen or activated form, and particularly a single-chain
domain thereof, with a substrate of the CVSP16, and detecting the proteolysis of
the substrate, whereby the activity of the CVSP16 is assessed. The results in the
presence and absence of the test compounds are compared. A difference in the
activity indicates that the test substance modulates the activity of the CVSP16.
Modulators, such as activators, of CVSP16 activity or activation also are
contemplated; such assays are discussed below.

In one embodiment a plurality of the test substances are screened
simultaneously in the above screening method. In another embodiment, the
CVSP16 is isolated from a target cell as a means for then identifying agents that
are potentially specific for the target cell.

In another embodiment, a test substance is potentially a therapeutic
compound. A difference in CVSP16 activity is measured in the presence and in
the absence of the test substance. A difference indicates that the target cell
responds to the compound.

One method includes the steps of (a) contacting the CVSP16 polypeptide
or protease domain thereof with one or a plurality of test compounds under
conditions conducive to interaction between the ligand and the compounds; and
(b) identifying one or more compounds in the plurality that specifically binds to the
ligand.

Another method provided herein includes the steps of a) contacting a
CVSP16 polypeptide or protease domain thereof with a substrate of the CVSP16
polypeptide, and detecting the proteolysis of the substrate, whereby the activity
of the CVSP16 polypeptide is assessed; b) contacting the CVSP16 polypeptide
with a substrate of the CVSP16 polypeptide in the presence of a test substance,
and detecting the proteolysis of the substrate, whereby the activity of the
CVSP16 polypeptide is assessed; and c) comparing the activity of the CVSP16
polypeptide assessed in steps a) and b), whereby a difference between the activity
measured in step a) and the activity measured in step b) indicates that the test
substance modulates the activity of the CVSP16 polypeptide.
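
The comparison in step c) can be sketched numerically. The following is an
illustration only: it assumes that activity is expressed as a proteolysis rate
in arbitrary units, and it uses a hypothetical 20% cut-off to decide whether a
difference is large enough to call. The specification states no numerical
threshold, and the function name and values are hypothetical.

```python
# Illustrative sketch only: compare protease activity without (step a) and
# with (step b) a test substance. The 20% threshold is a hypothetical choice,
# not a value from the specification.


def classify_test_substance(control_rate, test_rate, threshold=0.20):
    """Return (label, fractional_change) for a test substance.

    control_rate: activity measured in the absence of the test substance
    test_rate:    activity measured in the presence of the test substance
    """
    if control_rate <= 0:
        raise ValueError("control activity must be positive")
    change = (test_rate - control_rate) / control_rate
    if change <= -threshold:
        return "inhibitor", change
    if change >= threshold:
        return "activator", change
    return "no clear effect", change


if __name__ == "__main__":
    # Hypothetical rates in arbitrary units.
    for name, rate in [("compound A", 40.0), ("compound B", 150.0), ("compound C", 105.0)]:
        label, change = classify_test_substance(100.0, rate)
        print(f"{name}: {label} ({change:+.0%})")
```

The control value need not come from a parallel measurement; a previously
recorded value of the activity can be passed as `control_rate` in the same way.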

In another embodiment, a plurality of the test substances are screened
simultaneously. In comparing the activity of a CVSP16 polypeptide in the
presence and absence of a test substance to assess whether the test substance is
a modulator of the CVSP16 polypeptide, it is unnecessary to assay the activity in
parallel, although such parallel measurement is typical. It is possible to measure
the activity of the CVSP16 polypeptide at one time point and compare the
measured activity to a historical value of the activity of the CVSP16 polypeptide.

For instance, one can measure the activity of the CVSP16 polypeptide in
the presence of a test substance and compare with a historical value of the activity
of the CVSP16 polypeptide measured previously in the absence of the test
substance, and vice versa. This can be accomplished, for example, by providing
the activity of the CVSP16 polypeptide on an insert or pamphlet provided with a
kit for conducting the assay. Methods for selecting substrates for a particular SP
are described in the EXAMPLES, and particular proteolytic assays are described.

Combinations and kits containing the combinations, optionally including
instructions for performing the assays, are provided. The combinations include a
CVSP16 polypeptide and a substrate of the CVSP16 polypeptide to be assayed;
and, optionally, reagents for detecting proteolysis of the substrate. The
substrates, which can be chromogenic or fluorogenic molecules, including
proteins, subject to proteolysis by a particular CVSP16 polypeptide, can be
identified empirically by testing the ability of the CVSP16 polypeptide to cleave
the test substrate. Substrates that are cleaved most effectively (i.e., at the
lowest concentrations and/or fastest rate or under desirable conditions) are
identified.

Additionally provided herein is a kit containing the above-described 
combination. The kit optionally includes instructions for identifying a modulator of 
the activity of a CVSP16 polypeptide. Any CVSP16 polypeptide is contemplated 
as target for identifying modulators of the activity thereof. 

2. Binding assays

Also provided herein are methods for identification and isolation of agents, 
particularly compounds that bind to CVSP16s. The assays are designed to
identify agents that bind to the zymogen form, the single chain isolated protease
domain (or a protein, other than a CVSP16 polypeptide, that contains the protease
domain of a CVSP16 polypeptide), and to the activated form, including the
activated form derived from the full length zymogen or from a polypeptide that
contains the protease domain. The identified compounds are candidates or leads
for identification of compounds for treatments of tumors and other disorders and
diseases involving aberrant proliferation and/or angiogenesis. The CVSP16
polypeptides used in the methods include any CVSP16 polypeptide as defined
herein, including the CVSP16 single chain protease domain or proteolytically
active portion thereof.

A variety of methods are provided herein. These methods can be
performed in solution or in solid phase reactions in which the CVSP16
polypeptide(s) or protease domain(s) thereof are linked, either directly or indirectly
via a linker, to a solid support. Screening assays are described in the Examples,
and these assays can be used to identify candidate compounds. For purposes
herein, all binding assays described above are provided for CVSP16.

Methods for identifying an agent, such as a compound, that specifically
binds to a CVSP16 single chain protease domain, a zymogen or full-length
activated CVSP16 and/or two-chain protease domain thereof or other polypeptides
provided herein are provided. The method can be practiced by (a) contacting the
CVSP16 with one or a plurality of test agents under conditions conducive to
binding between the CVSP16 and an agent; and (b) identifying one or more
agents within the plurality that specifically bind to one or more CVSP16 forms.

For example, in practicing such methods the CVSP16 polypeptide is mixed
with a potential binding partner or an extract or fraction of a cell under conditions
that allow the association of potential binding partners with the polypeptide.
After mixing, peptides, polypeptides, proteins or other molecules that have
become associated with a CVSP16 are separated from the mixture. The binding
partner that bound to the CVSP16 can then be removed and further analyzed. To
identify and isolate a binding partner, the entire protein, for instance the entire
disclosed protein of SEQ ID No. 6, can be used. Alternatively, a fragment of the
protein can be used.

A variety of methods can be used to obtain cell extracts or body fluids, 
such as blood, serum, urine, sweat, synovial fluid, CSF and other such fluids. For
example, cells can be disrupted using either physical or chemical disruption
methods. Examples of physical disruption methods include, but are not limited to,
sonication and mechanical shearing. Examples of chemical lysis methods include,
but are not limited to, detergent lysis and enzyme lysis. A skilled artisan can
readily adapt methods for preparing cellular extracts in order to obtain extracts for
use in the present methods.

Once an extract of a cell is prepared, the extract is mixed with the CVSP16
under conditions in which association of the protein with the binding partner can
occur. A variety of conditions can be used, including conditions that resemble
conditions found in the cytoplasm of a human cell. Features such as osmolarity,
pH, temperature, and the concentration of cellular extract used, can be varied to
optimize the association of the protein with the binding partner. Similarly,
methods for isolation of molecules of interest from body fluids are known.

After mixing under appropriate conditions, the bound complex is separated 
from the mixture. A variety of techniques can be used to separate the mixture. 
For example, antibodies specific to a CVSP16 can be used to immunoprecipitate 
the binding partner complex. Alternatively, standard chemical separation
techniques such as chromatography and density/sediment centrifugation can be
used.

After removing the non-associated cellular constituents in the extract, the 
binding partner can be dissociated from the complex using conventional methods. 
For example, dissociation can be accomplished by altering the salt concentration 
and/or pH of the mixture.

To aid in separating associated binding partner pairs from the mixed 
extract, the CVSP16 can be immobilized on a solid support. For example, the
protein can be attached to a nitrocellulose matrix or acrylic beads. Attachment of
the protein or a fragment thereof to a solid support aids in separating
peptide/binding partner pairs from other constituents found in the extract. The
identified binding partners can be either a single protein or a complex made up of 
two or more proteins. 

Alternatively, the nucleic acid molecules encoding the single chain 
proteases can be used in a yeast two-hybrid system. The yeast two-hybrid 
system has been used to identify other protein partner pairs and can readily be
adapted to employ the nucleic acid molecules herein described. 

Another in vitro binding assay, particularly for a CVSP16, uses a mixture 
of a polypeptide that contains at least the catalytic domain of one of these 
proteins and one or more candidate binding targets or substrates. After 
incubating the mixture under appropriate conditions, the ability of the CVSP16 or
a polypeptide fragment thereof containing the catalytic domain to bind to or
interact with the candidate substrate is assessed. For cell-free binding assays,
one of the components includes or is coupled to a detectable label. The label can
provide for direct detection, such as radioactivity, luminescence, including
fluorescence, optical or electron density, or indirect detection such as an epitope
tag and an enzyme. A variety of methods can be employed to detect the label
depending on the nature of the label and other assay components. For example,
the label can be detected bound to the solid substrate or a portion of the bound 
complex containing the label can be separated from the solid substrate, and the 
label thereafter detected. 

3. Detection of signal transduction 

Secreted CVSPs, such as CVSP16, can be involved in signal transduction
either directly by binding to or interacting with a cell surface receptor or indirectly
by activating proteins, such as pro-growth factors that can initiate signal 
transduction. Assays for assessing signal transduction are well known to those of 
skill in the art, and can be adapted for use with the CVSP16 polypeptide. 

Assays for identifying agents that affect or alter signal transduction
mediated directly or indirectly, such as via activation of a pro-growth factor, by a
CVSP16, particularly the full length or a portion sufficient to exhibit proteolytic or
binding activity, are provided. Such assays include, for example, transcription based
assays in which modulation of a transduced signal is assessed by detecting an effect on
expression from a reporter gene (see, e.g., U.S. Patent No. 5,436,128).

4. Methods for Identifying Agents that Modulate the Expression of a
Nucleic Acid Encoding a CVSP16

Another embodiment provides methods for identifying agents that
modulate the expression of a nucleic acid encoding a CVSP16. Such assays use
any available means of monitoring for changes in the expression level of the
nucleic acids encoding a CVSP16.

In one assay format, cell lines that contain reporter gene fusions between
the open reading frame of CVSP16 or a domain thereof, particularly the protease
domain, and any assayable fusion partner can be prepared. Numerous assayable
fusion partners are known and readily available, including the firefly luciferase gene
and the gene encoding chloramphenicol acetyltransferase (Alam et al., Anal.
Biochem. 188: 245-54 (1990)). Cell lines containing the reporter gene fusions are
then exposed to the agent to be tested under appropriate conditions and time.
Differential expression of the reporter gene between samples exposed to the
agent and control samples identifies agents which modulate the expression of a
nucleic acid encoding a CVSP16.

Additional assay formats can be used to monitor the ability of the agent to
modulate the expression of a nucleic acid encoding a CVSP16. For instance,
mRNA expression can be monitored directly by hybridization to the nucleic acids.
Cell lines are exposed to the agent to be tested under appropriate conditions and
time and total RNA or mRNA is isolated by standard procedures (see, e.g.,
Sambrook et al. (2001) Molecular Cloning: A Laboratory Manual (3rd Ed.), Vol. 1-
3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.). Probes to detect
differences in RNA expression levels between cells exposed to the agent and
control cells can be prepared from the nucleic acids. It is typical, but not
necessary, to design probes which hybridize only with target nucleic acids under
conditions of high stringency. Only highly complementary nucleic acid hybrids
form under conditions of high stringency. Accordingly, the stringency of the
assay conditions determines the amount of complementarity which should exist
between two nucleic acid strands in order to form a hybrid. Stringency should be
chosen to maximize the difference in stability between the probe:target hybrid and
potential probe:non-target hybrids.

Probes can be designed from the nucleic acids through methods known in 
the art. For instance, the G + C content of the probe and the probe length can 
affect probe binding to its target sequence. Methods to optimize probe specificity 
are commonly available (see, e.g., Sambrook et al. (2001) Molecular Cloning: A
Laboratory Manual (3rd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor, N.Y.; Sambrook et al. (1989) MOLECULAR CLONING: A LABORATORY
MANUAL, 2nd Ed., Cold Spring Harbor Laboratory Press; and Ausubel et al.
(1995) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Greene Publishing Co.,
NY).
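
As a rough illustration of how G + C content and probe length bear on probe binding, the following sketch computes the G + C fraction of a candidate probe and a melting temperature estimate. The estimate uses the Wallace rule (2 °C per A/T and 4 °C per G/C), a common rule of thumb for short oligonucleotides that is swapped in here for illustration and is not a method stated in this description; the function names and the example sequences are hypothetical.

```python
def gc_fraction(probe: str) -> float:
    """Return the fraction of G and C bases in a DNA probe sequence."""
    seq = probe.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("probe must be a non-empty sequence of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(probe: str) -> int:
    """Estimate melting temperature (deg C) with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This approximation is intended only for
    short oligonucleotides; longer probes need a different model.
    """
    seq = probe.upper()
    gc_fraction(seq)  # reuse the validation above
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count
```

Under this approximation, a longer or more G + C-rich probe yields a higher estimated melting temperature, which is one reason both properties are considered when choosing hybridization stringency.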

Hybridization conditions are modified using known methods (see, e.g.,
Sambrook et al. (2001) Molecular Cloning: A Laboratory Manual (3rd Ed.), Vol. 1-
3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; Sambrook et al.
(1989) MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Cold Spring
Harbor Laboratory Press; and Ausubel et al. (1995) CURRENT PROTOCOLS IN
MOLECULAR BIOLOGY, Greene Publishing Co., NY), as required for each probe.
Hybridization of total cellular RNA or RNA enriched for polyA RNA can be
accomplished in any available format. For instance, total cellular RNA or RNA
enriched for polyA RNA can be affixed to a solid support, and the solid support
exposed to at least one probe comprising at least one, or part of one of the
nucleic acid molecules under conditions in which the probe specifically hybridizes.
Alternatively, nucleic acid fragments comprising at least one, or part of one of the
sequences can be affixed to a solid support, such as a porous glass wafer. The
glass wafer can then be exposed to total cellular RNA or polyA RNA from a
sample under conditions in which the affixed sequences specifically hybridize.
Such glass wafers and hybridization methods are widely available, for example,
those disclosed by Beattie (WO 95/11755). By examining for the ability of a
given probe to specifically hybridize to an RNA sample from an untreated cell
population and from a cell population exposed to the agent, agents which up or
down regulate the expression of a nucleic acid encoding the CVSP16 polypeptide
are identified.

In one format, the relative amounts of a protein between a cell population
that has been exposed to the agent to be tested compared to an un-exposed
control cell population can be assayed (e.g., a prostate cancer cell line, a lung
cancer cell line, a colon cancer cell line or a breast cancer cell line). In this
format, probes, such as specific antibodies, are used to monitor the differential
expression or level of activity of the protein in the different cell populations or
body fluids. Cell lines or populations or body fluids are exposed to the agent to
be tested under appropriate conditions and time. Cellular lysates or body fluids
can be prepared from the exposed cell line or population and from a control,
unexposed cell line or population or unexposed body fluid. The cellular lysates or
body fluids are then analyzed with the probe.

For example, N- and C-terminal fragments of the CVSP16 can be
expressed in bacteria and used to search for proteins which bind to these
fragments. Fusion proteins, such as His-tag or GST fusion to the N- or C-terminal
regions of the CVSP16, can be prepared for use as a binding partner. These fusion
proteins can be coupled to, for example, Glutathione-Sepharose beads and then
probed with cell lysates or body fluids. Prior to lysis, the cells or body fluids can
be treated with a candidate agent which can modulate a CVSP16 or proteins that
interact with domains thereon. Lysate proteins binding to the fusion proteins can
be resolved by SDS-PAGE, isolated and identified by protein sequencing or mass
spectroscopy, as is known in the art.

Antibody probes are prepared by immunizing suitable mammalian hosts in
appropriate immunization protocols using the peptides, polypeptides or proteins if
they are of sufficient length (e.g., 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20,
25, 30, 35, 40 or more consecutive amino acids of the CVSP16 polypeptide) or, if
required to enhance immunogenicity, conjugated to suitable carriers. Methods for
preparing immunogenic conjugates with carriers, such as bovine serum albumin
(BSA), keyhole limpet hemocyanin (KLH), or other carrier proteins are well known
in the art. In some circumstances, direct conjugation using, for example,
carbodiimide reagents can be effective; in other instances linking reagents such as
those supplied by Pierce Chemical Co., Rockford, IL, can be desirable to provide
accessibility to the hapten. Hapten peptides can be extended at either the amino
or carboxy terminus with a Cys residue or interspersed with cysteine residues, for
example, to facilitate linking to a carrier. Administration of the immunogens is
conducted generally by injection over a suitable time period and with use of
suitable adjuvants, as is generally understood in the art. During the immunization
schedule, titers of antibodies are taken to determine adequacy of antibody
formation.

Anti-peptide antibodies can be generated using synthetic peptides
corresponding to, for example, the carboxy terminal amino acids of the CVSP16.
Synthetic peptides can be as small as 1-3 amino acids in length, generally at least
4 or more amino acid residues long. The peptides can be coupled to KLH using
standard methods and can be immunized into animals, such as rabbits or
ungulates. Polyclonal antibodies can then be purified, for example using
Actigel beads containing the covalently bound peptide, or other reagents for
affinity purification methods or by purifying all of the IgGs, using Protein A or
Protein G columns or other such methods.

While the polyclonal antisera produced in this way can be satisfactory for
some applications, for pharmaceutical compositions monoclonal preparations are
generally used. Immortalized cell lines which secrete the desired monoclonal
antibodies can be prepared using the standard method of Kohler et al. (Nature
256: 495-7 (1975)) or modifications which effect immortalization of lymphocytes
or spleen cells, as is generally known. The immortalized cell lines secreting the
desired antibodies are screened by immunoassay in which the antigen is the
peptide hapten, polypeptide or protein. When the appropriate immortalized cell
culture secreting the desired antibody is identified, the cells can be cultured either
in vitro or by production in vivo via ascites fluid. Of particular interest are
monoclonal antibodies that recognize the catalytic domain of a CVSP16.

Additionally, the zymogen, a single chain or a three-chain or a two-chain
form of the CVSP16 can be used to make monoclonal antibodies that recognize
conformational epitopes. The desired monoclonal antibodies are then recovered
from the culture supernatant or from the ascites supernatant. Fragments of the
monoclonals or the polyclonal antisera that contain the antigen binding portion can
be used as antagonists, as well as the intact antibodies. Immunologically reactive
fragments, such as the Fab, Fab', or F(ab')2 fragments are often used, especially
in a therapeutic context, as these fragments are generally less immunogenic than
the whole immunoglobulin. Regions that bind specifically to the desired regions of
receptor also can be produced in the context of chimeras with multiple species
origin. Fully human antibodies also can be prepared using, for example, either
transgenic mice that contain human immunoglobulin genes or phage display
libraries of human antibodies.

G. Assay formats and selection of test substances that modulate at least one 
activity of a CVSP16 polypeptide 

Methods for identifying agents that modulate at least one activity of a
CVSP16 are provided. Methods include phage display and include other methods
for assessing alterations in the activity of a CVSP16. Such methods or assays
can use any means of monitoring or detecting the desired activity. A variety of
formats and detection protocols are known for performing screening assays. Any
such formats and protocols can be adapted for identifying modulators of CVSP16
polypeptide activities. The following includes a discussion of exemplary protocols.
1. High throughput screening assays
Although the above-described assay can be conducted where a single
CVSP16 polypeptide is screened, and/or a single test substance is screened in
one assay, the assay typically is conducted in a high throughput screening mode,
i.e., a plurality of the SP proteins are screened against and/or a plurality of the
test substances are screened simultaneously (see generally, High Throughput
Screening: The Discovery of Bioactive Substances (Devlin, Ed.) Marcel Dekker,
1997; Sittampalam et al., Curr. Opin. Chem. Biol., 7:384-91 (1997); and
Silverman et al., Curr. Opin. Chem. Biol., 2:397-403 (1998)). For example, the
assay can be conducted in a multi-well (e.g., 24-, 48-, 96-, 384-, 1536-well or
higher density), chip or array format.

High-throughput screening (HTS) is the process of testing a large number
of diverse chemical structures against disease targets to identify "hits"
(Sittampalam et al., Curr. Opin. Chem. Biol., 7:384-91 (1997)). Current state-of-
the-art HTS operations are highly automated and computerized to handle sample
preparation, assay procedures and the subsequent processing of large volumes of
data.

Detection technologies employed in high-throughput screens depend on the
type of biochemical pathway being investigated (Sittampalam et al., Curr. Opin.
Chem. Biol., 7:384-91 (1997)). These methods include radiochemical methods,
such as the scintillation proximity assays (SPA), which can be adapted to a
variety of enzyme assays (Lerner et al., J. Biomol. Screening, 7:135-143 (1996);
Baker et al., Anal. Biochem., 239:20-24 (1996); Baum et al., Anal. Biochem.,
237:129-134 (1996); and Sullivan et al., J. Biomol. Screening 2:19-23 (1997))
and protein-protein interaction assays (Braunwalder et al., J. Biomol. Screening
7:23-26 (1996); Sonatore et al., Anal. Biochem. 240:289-297 (1996); and Chen
et al., J. Biol. Chem. 277:25308-25315 (1996)), and non-isotopic detection
methods, including but not limited to, colorimetric and luminescence detection
methods, resonance energy transfer (RET) methods, time-resolved fluorescence
(HTRF) methods, cell-based fluorescence assays, such as fluorescence resonance
energy transfer (FRET) procedures (see, e.g., Gonzalez et al., Biophys. J.,
55:1272-1280 (1995)), fluorescence polarization or anisotropy methods (see,
e.g., Jameson et al., Methods Enzymol. 246:283-300 (1995); Jolley, J. Biomol.
Screening 1:33-38 (1996); Lynch et al., Anal. Biochem. 247:77-82 (1997)),
fluorescence correlation spectroscopy (FCS) and other such methods.

2. Test Substances 
Test compounds, including small molecules, antibodies, proteins, nucleic
acids, peptides, natural products, mixtures of natural products, derivatives (e.g.,
chemical derivatives) of natural products, and libraries and collections thereof, can
be screened in the above-described assays and assays described below to identify
compounds that modulate the activity of a CVSP16 polypeptide. Rational drug
design methodologies that rely on computational chemistry can be used to screen
and identify candidate compounds.

Test compounds (agents) that are assayed in the methods can be produced
and obtained by any method known to those of skill in the art. For example, they
can be randomly selected or rationally selected or designed. The agents can be,
as examples, peptides, small molecules, and carbohydrates. A skilled artisan can
readily recognize that there is no limit to the structural nature of the agents. The
peptide agents can be prepared using standard solid phase (or solution phase)
peptide synthesis methods, as is known in the art. In addition, the DNA encoding
these peptides can be synthesized using commercially available oligonucleotide
synthesis instrumentation and produced recombinantly using standard
recombinant production systems. The production using solid phase peptide
synthesis is necessitated if non-gene-encoded amino acids are to be included.

The compounds identified by the screening methods include inhibitors, 
including antagonists, and agonists. Compounds for screening include any 
compounds and collections of compounds available, known or that can be 
prepared. 



WO 2004/005471 



PCT/US2003/020959 




a. Selection of Compounds 

Compounds can be selected for their potency and selectivity of inhibition
of serine proteases, especially a CVSP16 polypeptide. As described herein, and
as generally known, a target serine protease and its substrate are combined under
assay conditions permitting reaction of the protease with its substrate. The assay
is performed in the absence of test compound, and in the presence of increasing
concentrations of the test compound. The concentration of test compound at
which 50% of the serine protease activity is inhibited by the test compound is the
IC50 value (Inhibitory Concentration) or EC50 (Effective Concentration) value for
that compound. Within a series or group of test compounds, those having lower
IC50 or EC50 values are considered more potent inhibitors of the serine protease
than those compounds having higher IC50 or EC50 values. The IC50 measurement
is often used for simpler assays, whereas the EC50 is often used for more
complicated assays, such as those employing cells.
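
The determination of the concentration giving 50% inhibition can be illustrated with a minimal sketch. It assumes a simple interpolation on a log-concentration scale between the two tested concentrations that bracket half of the uninhibited activity; this is one common simplification, not the fitting method of this description (a four-parameter dose-response fit is often used in practice). The function name and the example data are hypothetical.

```python
import math


def estimate_ic50(concentrations, activities, uninhibited):
    """Estimate the concentration at which activity falls to 50% of control.

    concentrations: positive test-compound concentrations (any order).
    activities:     protease activity measured at each concentration.
    uninhibited:    activity measured in the absence of test compound.
    Returns None when the data never cross the 50% level.
    """
    if len(concentrations) != len(activities) or len(concentrations) < 2:
        raise ValueError("need at least two matched data points")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    half = 0.5 * uninhibited
    points = sorted(zip(concentrations, activities))
    # Walk adjacent pairs and interpolate where activity crosses the 50% level.
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= half >= a_hi and a_lo != a_hi:
            frac = (a_lo - half) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None
```

With hypothetical activities of 90, 75, 25 and 5 units at 1, 10, 100 and 1000 nM and an uninhibited activity of 100 units, the 50% level is crossed between 10 nM and 100 nM, and the interpolated estimate falls between those two concentrations.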

Typically candidate compounds have an IC50 value of 100 nM or less as
measured in an in vitro assay for inhibition of CVSP16 polypeptide activity. The
test compounds also are evaluated for selectivity toward a serine protease. As
described herein, and as generally known, a test compound is assayed for its
potency toward a panel of serine proteases and other enzymes and an IC50 value
or EC50 value is determined for each test compound in each assay system. A
compound that demonstrates a low IC50 value or EC50 value for the target enzyme,
e.g., CVSP16 polypeptide, and a higher IC50 value or EC50 value for other enzymes
within the test panel (e.g., urokinase, tissue plasminogen activator, thrombin,
Factor Xa), is considered to be selective toward the target enzyme. Generally, a
compound is deemed selective if its IC50 value or EC50 value in the target enzyme
assay is at least 2-fold, 5-fold, 10-fold (or higher-fold) less than the next smallest
IC50 value or EC50 value measured in the selectivity panel of enzymes.
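
The fold-selectivity criterion described above can be expressed as a short calculation: the smallest IC50 (or EC50) value in the off-target panel is divided by the value for the target enzyme. The sketch below is illustrative only; the function names, the default 10-fold threshold, and the example IC50 values are assumptions introduced here, not data from this description.

```python
def fold_selectivity(target_ic50: float, panel_ic50s: dict) -> float:
    """Return the ratio of the next-smallest panel IC50 to the target IC50."""
    if target_ic50 <= 0:
        raise ValueError("target IC50 must be positive")
    if not panel_ic50s:
        raise ValueError("selectivity panel must not be empty")
    return min(panel_ic50s.values()) / target_ic50


def is_selective(target_ic50: float, panel_ic50s: dict,
                 min_fold: float = 10.0) -> bool:
    """True when the target IC50 is at least `min_fold` lower than any panel value."""
    return fold_selectivity(target_ic50, panel_ic50s) >= min_fold
```

For instance, with a hypothetical target IC50 of 50 nM and panel values of 500 nM, 2000 nM and 800 nM, the next smallest value is 500 nM, so the compound is 10-fold selective and would meet a 2-fold, 5-fold or 10-fold threshold but not a 20-fold one.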

Compounds are also evaluated for their activity in vivo. The type of assay
chosen for evaluation of test compounds depends on the pathological condition to
be treated or prevented by use of the compound, as well as the route of
administration to be evaluated for the test compound.

For instance, to evaluate the activity of a compound to reduce tumor
growth through inhibition of CVSP16 polypeptide, the procedures described by
Jankun et al., Canc. Res. 57:559-563 (1997) can be employed. Briefly, the ATCC
cell lines DU145 and LnCaP are injected into SCID mice. After tumors are
established, the mice are given test compound according to a dosing regime
determined from the compound's in vitro characteristics. The Jankun et al.
compound was administered in water. Tumor volume measurements are taken
twice a week for about five weeks. A compound is deemed active if an animal to
which the compound was administered exhibited decreased tumor volume, as
compared to animals receiving appropriate control compounds.

Another in vivo experimental model designed to evaluate the effect of p-
aminobenzamidine, a serine protease inhibitor, on reducing tumor volume is
described by Billstrom et al., Int. J. Cancer 57:542-547 (1995).

To evaluate the ability of a compound to reduce the occurrence of, or
inhibit, metastasis, the procedures described by Kobayashi et al. Int. J. Canc.
57:727-733 (1994) can be employed. Briefly, a murine xenograft selected for
high lung colonization potential is injected into C57Bl/6 mice i.v. (experimental
metastasis) or s.c. into the abdominal wall (spontaneous metastasis). Various
concentrations of the compound to be tested can be admixed with the tumor cells
in Matrigel prior to subcutaneous injection. Daily i.p. injections of the test
compound are made either on days 1-6 or days 7-13 after tumor inoculation. The
animals are sacrificed about three or four weeks after tumor inoculation, and the
lung tumor colonies are counted. Evaluation of the resulting data permits a
determination as to efficacy of the test compound, optimal dosing and route of
administration.

The activity of the tested compounds toward decreasing tumor volume and
metastasis can be evaluated in a model, such as that described in Rabbani et al.,
Int. J. Cancer 53:840-845 (1995) to evaluate its inhibitor. There, Mat LyLu tumor
cells were injected into the flank of Copenhagen rats. The animals were
implanted with osmotic minipumps to continuously administer various doses of
test compound for up to three weeks. The tumor mass and volume of
experimental and control animals were evaluated during the experiment, as were
metastatic growths. Evaluation of the resulting data permits a determination as to
efficacy of the test compound, optimal dosing, and route of administration. Some
of these authors described a related protocol in Xing et al., Canc. Res. 57:3585-
3593 (1997).

To evaluate the anti-angiogenesis activity of a compound, a rabbit cornea
neovascularization model can be employed (see, e.g., Avery et al. (1990) Arch.
Ophthalmol. 705:1474-147). Avery et al. describes anesthetizing New Zealand
albino rabbits and then making a central corneal incision and forming a radial
corneal pocket. A slow release prostaglandin pellet was placed in the pocket to
induce neovascularization. Test compound was administered i.p. for five days, at
which time the animals were sacrificed. The effect of the test compound is
evaluated by review of periodic photographs taken of the limbus, which can be
used to calculate the area of neovascular response and, therefore, limbal
neovascularization. A decreased area of neovascularization as compared with
appropriate controls indicates the test compound was effective at decreasing or
inhibiting neovascularization.

An angiogenesis model used to evaluate the effect of a test compound in
preventing angiogenesis is described by Min et al. Canc. Res. 55:2428-2433
(1996). C57BL6 mice receive subcutaneous injections of a Matrigel mixture
containing bFGF, as the angiogenesis-inducing agent, with and without the test
compound. After five days, the animals are sacrificed and the Matrigel plugs, in
which neovascularization can be visualized, are photographed. An experimental
animal receiving Matrigel and an effective dose of test compound exhibits less
vascularization than a control animal or an experimental animal receiving a less- or
non-effective dose of compound.

An in vivo system designed to test compounds for their ability to limit the
spread of primary tumors is described by Crowley et al., Proc. Natl. Acad. Sci.
50:5021-5025 (1993). Nude mice are injected with tumor cells (PC3) engineered
to express CAT (chloramphenicol acetyltransferase). Compounds to be tested for
their ability to decrease tumor size and/or metastases are administered to the
animals, and subsequent measurements of tumor size and/or metastatic growths
are made. In addition, the level of CAT detected in various organs provides an
indication of the ability of the test compound to inhibit metastasis; detection of
less CAT in tissues of a treated animal versus a control animal indicates less CAT-
expressing cells migrated to that tissue.

In vivo experimental models designed to evaluate the inhibitory potential of
test serine protease inhibitors, using a tumor cell line F3II known to be highly
invasive (see, e.g., Alonso et al., Breast Canc. Res. Treat. 40:209-223 (1996)),
can be used. Alonso describes in vivo studies for toxicity determination, tumor
growth, invasiveness, spontaneous metastasis, experimental lung metastasis, and
an angiogenesis assay.

The CAM model (chick embryo chorioallantoic membrane model), first
described by L. Ossowski in 1988 (J. Cell Biol. 707:2437-2445 (1988)), provides
another method for evaluating the inhibitory activity of a test compound. In the
CAM model, tumor cells invade through the chorioallantoic membrane.
Administration of several serine protease inhibitors resulted in less or no invasion
of the tumor cells through the membrane. Thus, the CAM assay is performed
with CAM and tumor cells in the presence and absence of various concentrations
of test compound. The invasiveness of tumor cells is measured under such
conditions to provide an indication of the compound's inhibitory activity. A
compound having inhibitory activity correlates with less tumor invasion.

The CAM model also is used in a standard assay of angiogenesis (i.e.,
effect on formation of new blood vessels) (see, e.g., Brooks et al. Methods in
Molecular Biology 129:257-269 (1999)). In this model, a filter disc containing an
angiogenesis inducer, such as basic fibroblast growth factor (bFGF), is placed onto
the CAM. Diffusion of the cytokine into the CAM induces local angiogenesis,
which can be measured in several ways such as by counting the number of blood
vessel branch points within the CAM directly below the filter disc. The ability of
identified compounds to inhibit cytokine-induced angiogenesis can be tested using
this model. A test compound can either be added to the filter disc that contains
the angiogenesis inducer, be placed directly on the membrane or be administered
systemically. The extent of new blood vessel formation in the presence and/or
absence of test compound can be compared using this model. The formation of
fewer new blood vessels in the presence of a test compound would be indicative
of anti-angiogenesis activity. A demonstration of anti-angiogenesis activity for
inhibitors of a CVSP16 polypeptide is indicative of a role in angiogenesis for that
SP protein.

b. Known serine protease inhibitors 
Compounds for screening can be serine protease inhibitors, which can be
tested for their ability to inhibit the activity of a CVSP16.

Exemplary serine protease inhibitors for use in the screening assays include, but
are not limited to: Serine Protease Inhibitor 3 (SPI-3) (Chen et al., Cytokine
11:856-862 (1999)); Aprotinin (Iijima, R., et al., J. Biochem. (Tokyo) 126:912-916
(1999)); Kazal-type serine protease inhibitor-like proteins (Niimi et al., Eur. J.
Biochem. 265:282-292 (1999)); Kunitz-type serine protease inhibitor
(Ravichandran, S., et al., Acta Crystallogr. D Biol. Crystallogr. 55:1814-1821
(1999)); Tissue factor pathway inhibitor-2/Matrix-associated serine protease
inhibitor (TFPI-2/MSPI) (Liu, Y., et al., Arch. Biochem. Biophys. 370:112-8
(1999)); Bikunin (Cui, C.Y., et al., J. Invest. Dermatol. 113:182-8 (1999));
Nafamostat mesilate (Ryo, R., et al., Vox Sang. 76:241-6 (1999)); TPCK (Huang et
al., Oncogene 18:3431-3439 (1999)); A synthetic cotton-bound serine protease
inhibitor (Edwards et al., Wound Repair Regen. 7:106-18 (1999)); FUT-175
(Sawada, M., et al., Stroke 30:644-50 (1999)); Combination of serine protease
inhibitor FUT-0175 and thromboxane synthetase inhibitor OKY-046 (Kaminogo et
al., Neurol. Med. Chir. (Tokyo) 38:704-8; discussion 708-9 (1998)); The rat serine
protease inhibitor 2.1 gene (LeCam, A., et al., Biochem. Biophys. Res. Commun.
253:311-4 (1998)); A new intracellular serine protease inhibitor expressed in the
rat pituitary gland complexes with granzyme B (Hill et al., FEBS Lett.
(1998)); 3,4-Dichloroisocoumarin (Hammed et al., Proc. Soc. Exp. Biol. Med.
219:132-7 (1998)); LEX032 (Bains et al., Eur. J. Pharmacol. 356:67-72 (1998));
N-tosyl-L-phenylalanine chloromethyl ketone (Dryjanski et al., Biochemistry
37:14151-6 (1998)); Mouse gene for the serine protease inhibitor neuroserpin
(PI12) (Berger et al., Gene 214:25-33 (1998)); Rat serine protease inhibitor 2.3
gene (Paul et al., Eur. J. Biochem. 254:538-46 (1998)); Ecotin (Yang et al., J.
Mol. Biol. 275:945-57 (1998)); A 14 kDa plant-related serine protease inhibitor
(Roch et al., Dev. Comp. Immunol. 22(1):1-12 (1998)); Matrix-associated serine
protease inhibitor TFPI-2/33 kDa MSPI (Rao et al., Int. J. Cancer 76:749-56
(1998)); ONO-3403 (Hiwasa et al., Cancer Lett. 126:221-5 (1998)); Bdellastasin
(Moser et al., Eur. J. Biochem. 253:212-20 (1998)); Bikunin (Xu et al., J. Mol. Biol.
276:955-66 (1998)); Nafamostat mesilate (Mellgren et al., Thromb. Haemost.
79:342-7 (1998)); The growth hormone dependent serine protease inhibitor, Spi
2.1 (Maake et al., Endocrinology 138:5630-6 (1997)); Growth factor activator
inhibitor type 2, a Kunitz-type serine protease inhibitor (Kawaguchi et al., J. Biol.
Chem. 272:27558-64 (1997)); Heat-stable serine protease inhibitor protein from
ovaries of the desert locust, Schistocerca gregaria (Hamdaoui et al., Biochem.
Biophys. Res. Commun. 236:357-60 (1997)); Human placental Hepatocyte
growth factor activator inhibitor, a Kunitz-type serine protease inhibitor
(Shimomura et al., J. Biol. Chem. 272:6370-6 (1997)); FUT-187, oral serine
protease inhibitor (Shiozaki et al., Gan To Kagaku Ryoho 23(14):1971-9 (1996));
Extracellular matrix-associated serine protease inhibitors (Mr 33,000, 31,000, and
27,000) (Rao, C.N., et al., Arch. Biochem. Biophys. 335:82-92 (1996)); An
irreversible isocoumarin serine protease inhibitor (Palencia, D.D., et al., Biol.
Reprod. 55:536-42 (1996)); 4-(2-aminoethyl)-benzenesulfonyl fluoride (AEBSF)
(Nakabo et al., J. Leukoc. Biol. 60:328-36 (1996)); Neuroserpin (Osterwalder, T.,
et al., EMBO J. 15:2944-53 (1996)); Human serine protease inhibitor alpha-1-
antitrypsin (Forney et al., J. Parasitol. 82:496-502 (1996)); Rat serine protease
inhibitor 2.3 (Simar-Blanchet, A.E., et al., Eur. J. Biochem. 236:638-48 (1996));
Gabexate mesilate (Parodi, F., et al., J. Cardiothorac. Vasc. Anesth. 10:235-7
(1996)); Recombinant serine protease inhibitor, CPTI II (Stankiewicz, M., et al.,
Acta Biochim. Pol. 43(3):525-9 (1996)); A cysteine-rich serine protease inhibitor
(Guamerin II) (Kim, D.R., et al., J. Enzym. Inhib. 10:81-91 (1996));
diisopropylfluorophosphate (Lundqvist, H., et al., Inflamm. Res. 44(12):510-7
(1995)); Nexin 1 (Yu, D.W., et al., J. Cell Sci. 108(Pt 12):3867-74 (1995));
LEX032 (Scalia, R., et al., Shock 4(4):251-6 (1995)); Protease nexin I (Houenou,
L.J., et al., Proc. Natl. Acad. Sci. U.S.A. 92(3):895-9 (1995)); Chymase-directed
serine protease inhibitor (Woodard, S.L., et al., J. Immunol. 153(11):5016-25
(1994)); N-alpha-tosyl-L-lysyl-chloromethyl ketone (TLCK) (Bourinbaiar, A.S., et
al., Cell Immunol. 155(1):230-6 (1994)); Smpi56 (Ghendler, Y., et al., Exp.
Parasitol. 78(2):121-31 (1994)); Schistosoma haematobium serine protease
inhibitor (Blanton, R.E., et al., Mol. Biochem. Parasitol. 63(1):1-11 (1994)); Spi-1
(Warren, W.C., et al., Mol. Cell Endocrinol. 98(1):27-32 (1993)); TAME (Jessop,
J.J., et al., Inflammation 17:613-31 (1993)); Antithrombin III (Kalaria, R.N., et
al., Am. J. Pathol. 143(3):886-93 (1993)); FOY-305 (Ohkoshi, M., et al.,
Anticancer Res. 13(4):963-6 (1993)); Camostat mesilate (Senda, S., et al., Intern.
Med. 32(4):350-4 (1993)); Pigment epithelium-derived factor (Steele, F.R., et al.,
Proc. Natl. Acad. Sci. U.S.A. 90(4):1526-30 (1993)); Antistasin (Holstein, T.W.,
et al., FEBS Lett. 309(3):288-92 (1992)); the vaccinia virus K2L gene encodes a
serine protease inhibitor (Zhou, J., et al., Virology 189(2):678-86 (1992));
Bowman-Birk serine-protease inhibitor (Werner, M.H., et al., J. Mol. Biol.
225:873-89 (1992)); FUT-175 (Yanamoto, H., et al., Neurosurgery 30(3):358-
63 (1992)); FUT-175 (Yanamoto, H., et al., Neurosurgery 30(3):351-6,
discussion 356-7 (1992)); PAI-1 (Treadwell, B.V., et al., J. Orthop. Res. 9(3):309-
16 (1991)); 3,4-Dichloroisocoumarin (Rusbridge, N.M., et al., FEBS Lett.
268(1):133-6 (1990)); Alpha 1-antichymotrypsin (Lindmark, B.E., et al., Am. Rev.
Respir. Dis. 141(4 Pt 1):884-8 (1990)); P-toluenesulfonyl-L-arginine methyl ester
(TAME) (Scuderi, P., J. Immunol. 143(1):168-73 (1989)); Alpha 1-
antichymotrypsin (Abraham, C.R., et al., Cell 52(4):487-501 (1988)); Contrapsin
(Modha, J., et al., Parasitology 96(Pt 1):99-109 (1988)); Alpha 2-antiplasmin
(Holmes, W.E., et al., J. Biol. Chem. 262(4):1659-64 (1987)); 3,4-
dichloroisocoumarin (Harper, J.W., et al., Biochemistry 24:1831-41 (1985));
Diisopropylfluorophosphate (Tsutsui, K., et al., Biochem. Biophys. Res. Commun.
123(1):211-1 (1984)); Gabexate mesilate (Hesse, B., et al., Pharmacol. Res.
Commun. 16(7):631-45 (1984)); Phenylmethylsulfonyl fluoride (Dufer, J., et al.,
Scand. J. Haematol. 32(1):25-32 (1984)); Protease inhibitor CI-2 (McPhalen,
C.A., et al., J. Mol. Biol. 168(2):445-7 (1983)); Phenylmethylsulfonyl fluoride
(Sekar et al., Biochem. Biophys. Res. Commun. 89(2):474-8 (1979)); PGE1
(Feinstein et al., Prostaglandins 14:1075-93 (1977)).



WO 2004/005471 



PCT/US2003/020959 




c. Combinatorial libraries and other libraries 

The source of compounds for the screening assays can be libraries,
including, but not limited to, combinatorial libraries. Methods for synthesizing
combinatorial libraries and characteristics of such combinatorial libraries are
known in the art (see generally, Combinatorial Libraries: Synthesis, Screening and
Application Potential (Cortese Ed.) Walter de Gruyter, Inc., 1995; Tietze and Lieb,
Curr. Opin. Chem. Biol. 2(3):363-71 (1998); Lam, Anticancer Drug Des.
12(3):145-67 (1997); Blaney and Martin, Curr. Opin. Chem. Biol. 1(1):54-9
(1997); and Schultz and Schultz, Biotechnol. Prog. 12(6):729-43 (1996)).

Methods and strategies for generating diverse libraries, primarily peptide-
and nucleotide-based oligomer libraries, have been developed using molecular
biology methods and/or simultaneous chemical synthesis methodologies (see, e.g.,
Dower et al., Annu. Rep. Med. Chem. 26:271-280 (1991); Fodor et al., Science
251:767-773 (1991); Jung et al., Angew. Chem. Int. Ed. Engl. 31:367-383
(1992); Zuckerman et al., Proc. Natl. Acad. Sci. USA 89:4505-4509 (1992);
Scott et al., Science 249:386-390 (1990); Devlin et al., Science 249:404-406
(1990); Cwirla et al., Proc. Natl. Acad. Sci. USA 87:6378-6382 (1990); and
Gallop et al., J. Medicinal Chemistry 37:1233-1251 (1994)). The resulting
combinatorial libraries potentially contain millions of compounds that can be
screened to identify compounds that exhibit a selected activity.

The libraries fall into roughly three categories: fusion-protein-displayed
peptide libraries in which random peptides or proteins are presented on the
surface of phage particles or proteins expressed from plasmids; support-bound
synthetic chemical libraries in which individual compounds or mixtures of
compounds are presented on insoluble matrices, such as resin beads (see, e.g.,
Lam et al., Nature 354:82-84 (1991)) and cotton supports (see, e.g., Eichler et
al., Biochemistry 32:11035-11041 (1993)); and methods in which the compounds
are used in solution (see, e.g., Houghten et al., Nature 354:84-86 (1991);
Houghten et al., BioTechniques 13:412-421 (1992); and Scott et al., Curr. Opin.
Biotechnol. 5:40-48 (1994)). There are numerous examples of synthetic peptide
and oligonucleotide combinatorial libraries and there are many methods for
producing libraries that contain non-peptidic small organic molecules. Such
libraries can be based on a basis set of monomers that are combined to form
mixtures of diverse organic molecules or that can be combined to form a library
based upon a selected pharmacophore monomer.

Either a random or a deterministic combinatorial library can be screened by
the presently disclosed and/or claimed screening methods. In either of these two
libraries, each unit of the library is isolated and/or immobilized on a solid support.
In the deterministic library, one knows a priori a particular unit's location on each
solid support. In a random library, the location of a particular unit is not known a
priori although each site still contains a single unique unit. Many methods for
preparing libraries are known to those of skill in this art (see, e.g., Geysen et al.,
Proc. Natl. Acad. Sci. USA 81:3998-4002 (1984); Houghten et al., Proc. Natl.
Acad. Sci. USA 82:5131-5135 (1985)).
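
The distinction between deterministic and random libraries can be illustrated
with a short data-structure sketch. The Python fragment below is illustrative
only: the unit names, the activity test and the class and function names are
hypothetical, and the decoding step (for example, sequencing the contents of a
site) is an assumed stand-in for whatever identification method is used. In the
deterministic case the site-to-unit map is recorded in advance; in the random
case a hit is found by site first and identified afterwards.

```python
import random


def deterministic_library(units):
    """Site -> unit map recorded at synthesis, so identities are known a priori."""
    return dict(enumerate(units))


class RandomLibrary:
    """Each site holds one unique unit, but the site-to-unit map is not recorded."""

    def __init__(self, units, seed=None):
        contents = list(units)
        random.Random(seed).shuffle(contents)
        self._contents = contents
        self.decode_calls = 0

    def __len__(self):
        return len(self._contents)

    def assay(self, site, is_active):
        """Stand-in for a screening readout at one site."""
        return is_active(self._contents[site])

    def decode(self, site):
        """Stand-in for identifying the unit at a site after a hit is observed."""
        self.decode_calls += 1
        return self._contents[site]


UNITS = ["AGL", "FRK", "PWS", "TYV"]      # hypothetical library members


def is_active(unit):
    return unit == "FRK"                  # hypothetical activity


# Deterministic library: the identity of a hit is read from the layout.
layout = deterministic_library(UNITS)
det_hits = [unit for site, unit in layout.items() if is_active(unit)]

# Random library: hits are located by site, then decoded.
rand_lib = RandomLibrary(UNITS, seed=7)
hit_sites = [s for s in range(len(rand_lib)) if rand_lib.assay(s, is_active)]
rand_hits = [rand_lib.decode(s) for s in hit_sites]
```

Both routes arrive at the same active unit in this sketch; the random library
simply requires one decoding step per hit.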

Combinatorial libraries generated by any techniques known to those of skill
in the art are contemplated (see, e.g., Table 1 of Schultz and Schultz, Biotechnol.
Prog. 12(6):729-43 (1996)) for screening; Bartel et al., Science 261:1411-1418
(1993); Baumbach et al., BioPharm (May):24-35 (1992); Bock et al., Nature
355:564-566 (1992); Borman, S., Combinatorial chemists focus on small
molecules, molecular recognition, and automation, Chem. Eng. News 2(1 2):2B
(1996); Boublik, et al., Eukaryotic Virus Display: Engineering the Major Surface
Glycoproteins of the Autographa californica Nuclear Polyhedrosis Virus (ACNPV)
for the Presentation of Foreign Proteins on the Virus Surface, Bio/Technology
13:1079-1084 (1995); Brenner, et al., Encoded Combinatorial Chemistry, Proc.
Natl. Acad. Sci. U.S.A. 89:5381-5383 (1992); Caflisch, et al., Computational
Combinatorial Chemistry for De Novo Ligand Design: Review and Assessment,
Perspect. Drug Discovery Des. 3:51-84 (1995); Cheng, et al., Sequence-Selective
Peptide Binding with a Peptido-A,B-trans-steroidal Receptor Selected from an
Encoded Combinatorial Library, J. Am. Chem. Soc. 118:1813-1814 (1996); Chu,
et al., Affinity Capillary Electrophoresis to Identify the Peptide in A Peptide Library
that Binds Most Tightly to Vancomycin, J. Org. Chem. 58:648-652 (1993);
Clackson, et al., Making Antibody Fragments Using Phage Display Libraries,
Nature 352:624-628 (1991); Combs, et al., Protein Structure-Based Combinatorial
Chemistry: Discovery of Non-Peptide Binding Elements to Src SH3 Domain, J.
Am. Chem. Soc. 118:287-288 (1996); Cwirla, et al., Peptides On Phage: A Vast
Library of Peptides for Identifying Ligands, Proc. Natl. Acad. Sci. U.S.A. 87:6378-
6382 (1990); Ecker, et al., Combinatorial Drug Discovery: Which Method will
Produce the Greatest Value, Bio/Technology 13:351-360 (1995); Ellington, et al.,
In Vitro Selection of RNA Molecules That Bind Specific Ligands, Nature 346:818-
822 (1990); Ellman, J.A., Variants of Benzodiazepines, J. Am. Chem. Soc.
114:10997 (1992); Erickson, et al., The Proteins, Neurath, H., Hill, R.L., Eds.:
Academic: New York, 1976; pp. 255-257; Felici, et al., J. Mol. Biol. 222:301-310
(1991); Fodor, et al., Light-Directed, Spatially Addressable Parallel Chemical
Synthesis, Science 251:767-773 (1991); Francisco, et al., Transport and
Anchoring of Beta-Lactamase to the External Surface of E. coli, Proc. Natl. Acad.
Sci. U.S.A. 89:2713-2717 (1992); Georgiou, et al., Practical Applications of
Engineering Gram-Negative Bacterial Cell Surfaces, TIBTECH 11:6-10 (1993);
Geysen, et al., Use of peptide synthesis to probe viral antigens for epitopes to a
resolution of a single amino acid, Proc. Natl. Acad. Sci. U.S.A. 81:3998-4002
(1984); Glaser, et al., Antibody Engineering by Codon-Based Mutagenesis in a
Filamentous Phage Vector System, J. Immunol. 149:3903-3913 (1992); Gram, et
al., In vitro selection and affinity maturation of antibodies from a naive
combinatorial immunoglobulin library, Proc. Natl. Acad. Sci. 89:3576-3580
(1992); Han, et al., Liquid-Phase Combinatorial Synthesis, Proc. Natl. Acad. Sci.
U.S.A. 92:6419-6423 (1995); Hoogenboom, et al., Multi-Subunit Proteins on the
Surface of Filamentous Phage: Methodologies for Displaying Antibody (Fab) Heavy
and Light Chains, Nucleic Acids Res. 19:4133-4137 (1991); Houghten, et al.,
General Method for the Rapid Solid-Phase Synthesis of Large Numbers of
Peptides: Specificity of Antigen-Antibody Interaction at the Level of Individual
Amino Acids, Proc. Natl. Acad. Sci. U.S.A. 82:5131-5135 (1985); Houghten, et
al., The Use of Synthetic Peptide Combinatorial Libraries for the Determination of
Peptide Ligands in Radio-Receptor Assays-Opioid-Peptides, Bioorg. Med. Chem.
Lett. 3:405-412 (1993); Houghten, et al., Generation and Use of Synthetic
Peptide Combinatorial Libraries for Basic Research and Drug Discovery, Nature
354:84-86 (1991); Huang, et al., Discovery of New Ligand Binding Pathways in
Myoglobin by Random Mutagenesis, Nature Struct. Biol. 1:226-229 (1994); Huse,
et al., Generation of a Large Combinatorial Library of the Immunoglobulin
Repertoire In Phage Lambda, Science 246:1275-1281 (1989); Janda, K.D., New
Strategies for the Design of Catalytic Antibodies, Biotechnol. Prog. 6:178-181
(1990); Jung, et al., Multiple Peptide Synthesis Methods and Their Applications,
Angew. Chem. Int. Ed. Engl. 31:367-383 (1992); Kang, et al., Linkage of
Recognition and Replication Functions By Assembling Combinatorial Antibody Fab
Libraries Along Phage Surfaces, Proc. Natl. Acad. Sci. U.S.A. 88:4363-4366
(1991a); Kang, et al., Antibody Redesign by Chain Shuffling from Random
Combinatorial Immunoglobulin Libraries, Proc. Natl. Acad. Sci. U.S.A. 88:11120-
11123 (1991b); Kay, et al., An M13 Phage Library Displaying Random 38-Amino-
Acid-Peptides as a Source of Novel Sequences with Affinity to Selected Targets
Genes, Gene 128:59-65 (1993); Lam, et al., A new type of synthetic peptide
library for identifying ligand-binding activity, Nature 354:82-84 (1991) (published
errata in Nature 358:434 (1992) and Nature 360:768 (1992)); Lebl, et al., One
Bead One Structure Combinatorial Libraries, Biopolymers (Pept. Sci.) 37:177-198
(1995); Lerner, et al., Antibodies without Immunization, Science 258:1313-1314
(1992); Li, et al., Minimization of a Polypeptide Hormone, Science 270:1657-
1660 (1995); Light, et al., Display of Dimeric Bacterial Alkaline Phosphatase on
the Major Coat Protein of Filamentous Bacteriophage, Bioorg. Med. Chem. Lett.
2:1073-1079 (1992); Little, et al., Bacterial Surface Presentation of Proteins and
Peptides: An Alternative to Phage Technology, Trends Biotechnol. 11:3-5 (1993);
Marks, et al., By-Passing Immunization. Human Antibodies from V-Gene Libraries
Displayed on Phage, J. Mol. Biol. 222:581-597 (1991); Matthews, et al.,
Substrate Phage: Selection of Protease Substrates by Monovalent Phage Display,
Science 260:1113-1117 (1993); McCafferty, et al., Phage Enzymes: Expression
and Affinity Chromatography of Functional Alkaline Phosphatase on the Surface of
Bacteriophage, Protein Eng. 4:955-961 (1991); Menger, et al., Phosphatase
Catalysis Developed Via Combinatorial Organic Chemistry, J. Org. Chem.
60:6666-6667 (1995); Nicolaou, et al., Angew. Chem. Int. Ed. Engl. 34:2289-
2291 (1995); Oldenburg, et al., Peptide Ligands for A Sugar-Binding Protein
Isolated from a Random Peptide Library, Proc. Natl. Acad. Sci. U.S.A. 89:5393-
5397 (1992); Parmley, et al., Antibody-Selectable Filamentous fd Phage Vectors:
Affinity Purification of Target Genes, Gene 73:305-318 (1988); Pinilla, et al.,
Synthetic Peptide Combinatorial Libraries (SPCLS)-Identification of the Antigenic
Determinant of Beta-Endorphin Recognized by Monoclonal Antibody-3E7, Gene
128:71-76 (1993); Pinilla, et al., Review of the Utility of Soluble Combinatorial
Libraries, Biopolymers 37:221-240 (1995); Pistor, et al., Expression of Viral
Hemagglutinin On the Surface of E. coli, Klin. Wochenschr. 65:110-116 (1989);
Pollack, et al., Selective Chemical Catalysis by an Antibody, Science 234:1570-
1572 (1986); Rigler, et al., Fluorescence Correlations, Single Molecule Detection
and Large Number Screening: Applications in Biotechnology, J. Biotechnol.
41:177-186 (1995); Sarvetnick, et al., Increasing the Chemical Potential of the
Germ-Line Antibody Repertoire, Proc. Natl. Acad. Sci. U.S.A. 90:4008-4011
(1993); Sastry, et al., Cloning of the Immunological Repertoire in Escherichia coli
for Generation of Monoclonal Catalytic Antibodies: Construction of a Heavy Chain
Variable Region-Specific cDNA Library, Proc. Natl. Acad. Sci. U.S.A. 86:5728-
5732 (1989); Scott, et al., Searching for Peptide Ligands with an Epitope Library,
Science 249:386-390 (1990); Sears, et al., Engineering Enzymes for Bioorganic
Synthesis: Peptide Bond Formation, Biotechnol. Prog. 12:423-433 (1996);
Simon, et al., Peptoids: A Modular Approach to Drug Discovery, Proc. Natl.
Acad. Sci. U.S.A. 89:9367-9371 (1992); Still, et al., Discovery of Sequence-
Selective Peptide Binding by Synthetic Receptors Using Encoded Combinatorial
Libraries, Acc. Chem. Res. 29:155-163 (1996); Thompson, et al., Synthesis and
Applications of Small Molecule Libraries, Chem. Rev. 96:555-600 (1996);
Tramontano et al., Catalytic Antibodies, Science 234:1566-1570 (1986);
Wrighton, et al., Small Peptides as Potent Mimetics of the Protein Hormone
Erythropoietin, Science 273:458-464 (1996); York, et al., Combinatorial
mutagenesis of the reactive site region in plasminogen activator inhibitor I, J. Biol.
Chem. 266:8595-8600 (1991); Zebedee, et al., Human Combinatorial Antibody
Libraries to Hepatitis B Surface Antigen, Proc. Natl. Acad. Sci. U.S.A. 89:3175-
3179 (1992); Zuckerman, et al., Identification of Highest-Affinity Ligands by
Affinity Selection from Equimolar Peptide Mixtures Generated by Robotic
Synthesis, Proc. Natl. Acad. Sci. U.S.A. 89:4505-4509 (1992).

For example, peptides that bind to a CVSP16 polypeptide or a protease
domain of an SP protein can be identified using phage display libraries. In an
exemplary embodiment, this method can include (a) contacting phage from a
phage library with the CVSP16 polypeptide or a protease domain thereof; (b)
isolating phage that bind to the protein; and (c) determining the identity of at
least one peptide coded by the isolated phage to identify a peptide that binds to a
CVSP16 polypeptide.
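
Steps (a) through (c) can be sketched in outline as follows. This Python
fragment is a simplified illustration, not a laboratory protocol: the insert
sequences and the binding rule are hypothetical, and physical binding to the
polypeptide is represented by an assumed predicate function. Each phage is
modeled by the DNA insert encoding its displayed peptide, and step (c) is
modeled as translation of the recovered insert with the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(insert_dna):
    """Translate a DNA insert into the peptide it encodes (one-letter codes)."""
    usable = len(insert_dna) - len(insert_dna) % 3
    return "".join(CODON_TABLE[insert_dna[i:i + 3]] for i in range(0, usable, 3))


def pan(phage_inserts, binds_target):
    """Return the peptides encoded by phage that bind the target."""
    # (a) contact phage with the target and (b) isolate those that bind.
    bound = [dna for dna in phage_inserts if binds_target(translate(dna))]
    # (c) determine the identity of the peptide coded by each isolated phage.
    return sorted({translate(dna) for dna in bound})


# Hypothetical three-member library and a hypothetical binding rule.
library = ["TGGCGTAAA", "GGTGCTTCT", "TTTCGTAAA"]


def binds(peptide):
    return peptide.endswith("RK")


hits = pan(library, binds)
print(hits)
```

In practice, enrichment typically involves repeated rounds of binding and
amplification; a single pass is shown here only to keep the sketch short.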

H. Modulators of the activity of CVSP16 polypeptides 

Provided herein are compounds, identified by screening or produced using
the CVSP16 polypeptide or protease domain in other screening methods, that
modulate the activity of a CVSP16. These compounds act by directly interacting
with the CVSP16 polypeptide or by altering transcription or translation thereof.
Such molecules include, but are not limited to, antibodies that specifically react
with a CVSP16 polypeptide, particularly with the protease domain thereof;
antisense nucleic acids or double-stranded RNA (dsRNA) such as RNAi, including
those that contain modified nucleic acids, that alter expression of the CVSP16
polypeptide; peptide mimetics; and other such compounds.
1. Antibodies

Antibodies, including polyclonal and monoclonal antibodies, that
specifically bind to a CVSP16 polypeptide provided herein, including antibodies to
single-chain protease domains thereof, or to activated forms of a single-chain
protease domain, or to full-length activated or zymogen forms of the polypeptide or
to other portions of a CVSP16, are provided. Generally, the antibody is a
monoclonal antibody, and typically the antibody specifically binds to a protease
domain of the CVSP16 polypeptide.

Provided are antibodies that specifically bind to any domain of CVSP16,
and antibodies that specifically bind to two-chain and/or three-chain forms thereof.
Also provided are antibodies that specifically bind to an active site or active site
cleft of zymogen and activated forms. Neutralizing antibodies are also provided.

The CVSP16 polypeptide and domains, fragments, homologs and
derivatives thereof can be used as immunogens to generate antibodies that
specifically bind CVSP16 polypeptides and portions thereof. Such antibodies
include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab
fragments, and a Fab expression library. In a specific embodiment, antibodies to
human CVSP16 polypeptide are produced. In another embodiment, complexes
formed from fragments of CVSP16 polypeptide, which fragments contain the
serine protease domain, are used as immunogens for antibody production.

Antibodies provided herein include, but are not limited to, monoclonal and
polyclonal antibodies. They include antibodies that inhibit catalytic activity of a
CVSP16 polypeptide provided herein and/or a ligand or substrate binding activity
of the polypeptide. Also included are antibodies that specifically bind to a single-
chain protease domain 1 (PD1) of a CVSP16 polypeptide and/or two-chain PD1
and antibodies that specifically bind to a single-chain protease
domain 2 (PD2) of a CVSP16 polypeptide and/or two-chain PD2. Included are
antibodies that specifically bind to a single-chain form and/or to two-chain
and/or three-chain forms of a CVSP16 polypeptide.

Various procedures known in the art can be used for the production of
polyclonal antibodies to CVSP16 polypeptide, its domains, derivatives, fragments
or analogs. For production of the antibody, various host animals can be
immunized by injection with the native CVSP16 polypeptide or a synthetic
version, or a derivative of the foregoing, such as a cross-linked CVSP16
polypeptide. Such host animals include but are not limited to rabbits, mice, rats,
chickens and other animals. Various adjuvants can be used to increase the
immunological response, depending on the host species, and include but are not
limited to Freund's (complete and incomplete), mineral gels such as aluminum
hydroxide, surface active substances such as lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, dinitrophenol, and potentially useful human
adjuvants such as bacille Calmette-Guerin (BCG) and Corynebacterium parvum.

For preparation of monoclonal antibodies directed towards a CVSP16
polypeptide or domains, derivatives, fragments or analogs thereof, any technique
that provides for the production of antibody molecules by continuous cell lines in
culture can be used. Such techniques include but are not restricted to the
hybridoma technique originally developed by Kohler and Milstein (Nature 256:495-
497 (1975)), the trioma technique, the human B-cell hybridoma technique (Kozbor
et al., Immunology Today 4:72 (1983)), and the EBV hybridoma technique to
produce human monoclonal antibodies (Cole et al., in Monoclonal Antibodies and
Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 (1985)). For example, immortalized
cell lines that secrete the desired monoclonal antibodies are provided. The
immortalized cell lines secreting the desired antibodies are screened by
immunoassays in which the antigen is the peptide hapten, polypeptide or protein.
When the appropriate immortalized cell culture secreting the desired antibody is
identified, the cells can be cultured either in vitro or by production in vivo via
ascites fluid.

Monoclonal antibodies can be produced by other methods, such as in
germ-free animals utilizing recent technology (PCT/US90/02545). Human
antibodies can be used and can be obtained by using human hybridomas (Cote et
al., Proc. Natl. Acad. Sci. USA 80:2026-2030 (1983)), or by transforming human
B cells with EBV virus in vitro (Cole et al., in Monoclonal Antibodies and Cancer
Therapy, Alan R. Liss, Inc., pp. 77-96 (1985)). Techniques developed for the
production of "chimeric antibodies" (Morrison et al., Proc. Natl. Acad. Sci. USA
81:6851-6855 (1984); Neuberger et al., Nature 312:604-608 (1984); Takeda et
al., Nature 314:452-454 (1985)) by splicing the genes from a mouse antibody
molecule specific for the CVSP16 polypeptide together with genes from a human
antibody molecule of appropriate biological activity can be used.

Techniques described for the production of single chain antibodies (U.S.
Patent No. 4,946,778) can be adapted to produce CVSP16 polypeptide-specific single
chain antibodies. An additional embodiment uses the techniques described for the
construction of Fab expression libraries (Huse et al., Science 246:1275-1281
(1989)) to allow rapid and easy identification of monoclonal Fab fragments with
the desired specificity for CVSP16 polypeptide or domains, derivatives, or analogs
thereof. Non-human antibodies can be "humanized" by known methods (see,
e.g., U.S. Patent No. 5,225,539).

Antibody fragments that specifically bind to CVSP16 polypeptide or
epitopes thereof can be generated by techniques known in the art. For example,
such fragments include but are not limited to: the F(ab')2 fragment, which can be
produced by pepsin digestion of the antibody molecule; the Fab' fragments that
can be generated by reducing the disulfide bridges of the F(ab')2 fragment; the
Fab fragments that can be generated by treating the antibody molecule with
papain and a reducing agent; and Fv fragments.

In the production of antibodies, screening for the desired antibody can be
accomplished by techniques known in the art, e.g., ELISA (enzyme-linked
immunosorbent assay). To select antibodies specific to a particular domain of the
CVSP16 polypeptide, one can assay generated hybridomas for a product that binds
to the fragment of the CVSP16 polypeptide that contains such a domain.
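
Selection of hybridomas by such an immunoassay can be sketched as a simple
thresholding step. The Python fragment below is illustrative only: the clone
names, the absorbance readings and the function name are hypothetical, and the
cutoff rule (mean of negative-control wells plus three standard deviations) is
an assumed convention rather than a requirement of the methods described herein.
The readings are taken to come from wells coated with the fragment that contains
the domain of interest.

```python
from statistics import mean, stdev


def positive_clones(readings, negative_controls, n_sd=3.0):
    """Return clones whose signal exceeds the negative-control cutoff.

    readings: clone name -> absorbance on wells coated with the domain fragment.
    negative_controls: absorbances from wells with no specific antibody.
    """
    cutoff = mean(negative_controls) + n_sd * stdev(negative_controls)
    hits = sorted(clone for clone, od in readings.items() if od > cutoff)
    return hits, cutoff


# Hypothetical absorbance values.
negatives = [0.08, 0.10, 0.12]
supernatants = {"H1": 1.25, "H2": 0.11, "H3": 0.45, "H4": 0.15}

hits, cutoff = positive_clones(supernatants, negatives)
print(f"cutoff = {cutoff:.2f}; positive clones: {hits}")
```

Clones scored positive in this way would still need to be checked against other
portions of the polypeptide to confirm that binding is specific to the domain.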

The foregoing antibodies can be used in methods known in the art relating
to the localization and/or quantitation of CVSP16 polypeptide proteins, e.g., for
imaging these proteins and measuring levels thereof in appropriate physiological
samples in, for example, diagnostic methods. In another embodiment, anti-CVSP16
polypeptide antibodies, or fragments thereof containing the binding
domain, are used as therapeutic agents.
2. Peptides, Polypeptides and Peptide Mimetics

Provided herein are methods for identifying molecules that bind to and
modulate the activity of SP proteins. Included among molecules that bind to SPs,
particularly the single chain protease domain or catalytically active fragments
thereof, are peptides, polypeptides and peptide mimetics, including cyclic
peptides. Peptide mimetics are molecules or compounds that mimic the necessary
molecular conformation of a ligand or polypeptide for specific binding to a target
molecule such as a CVSP16 polypeptide. In an exemplary embodiment, the
peptides, polypeptides or peptide mimetics bind to the protease domain of the
CVSP16 polypeptide. Such peptides and peptide mimetics include those of
antibodies that specifically bind to a CVSP16 polypeptide and, typically, bind to
the protease domain of a CVSP16 polypeptide. The peptides, polypeptides and
peptide mimetics identified by methods provided herein can be agonists or
antagonists of CVSP16 polypeptides.

Such peptides and peptide mimetics are useful for diagnosing, treating,
preventing, and screening for a disease or disorder associated with CVSP16
polypeptide activity in a mammal. In addition, the peptides and peptide mimetics
are useful for identifying, isolating, and purifying molecules or compounds that
modulate the activity of a CVSP16 polypeptide, or specifically bind to a CVSP16
polypeptide, generally the protease domain of a CVSP16 polypeptide. Low
molecular weight peptides and peptide mimetics can have strong binding
properties to a target molecule, e.g., a CVSP16 polypeptide and/or the protease
domain of a CVSP16 polypeptide.

Peptides, polypeptides and peptide mimetics that bind to CVSP16
polypeptides as described herein can be administered to mammals, including
humans, to modulate CVSP16 polypeptide activity. Thus, methods for therapeutic
treatment and prevention of neoplastic diseases that include administering a
peptide, polypeptide or peptide mimetic compound in an amount sufficient to
modulate such activity are provided. Thus, also provided herein are methods for
treating a subject having such a disease or disorder in which a peptide, polypeptide
or peptide mimetic compound is administered to the subject in a therapeutically
effective dose or amount.

Compositions containing the peptides, polypeptides or peptide mimetics
can be administered for prophylactic and/or therapeutic treatments. In therapeutic
applications, compositions can be administered to a patient already suffering from
a disease, as described above, in an amount sufficient to cure or at least partially
arrest the symptoms of the disease and its complications. Amounts effective for
this use will depend on the severity of the disease and the weight and general
state of the patient and can be empirically determined.

In prophylactic applications, compositions containing the peptides, 
polypeptides and peptide mimetics are administered to a patient susceptible to or 
otherwise at risk of a particular disease. The amount administered for this 
purpose is defined to be a "prophylactically effective dose". In this use, the 
precise amounts again depend on the patient's state of health and weight. 
Accordingly, the peptides, polypeptides and peptide mimetics that bind to a 
CVSP16 polypeptide can be used to prepare pharmaceutical compositions 
containing, as an active ingredient, at least one of the peptides or peptide 
mimetics in association with a pharmaceutical carrier or diluent. The compounds 
can be administered, for example, by oral, pulmonary, parenteral (intramuscular, 
intraperitoneal, intravenous (IV) or subcutaneous injection), inhalation (via a 
fine powder formulation), transdermal, nasal, vaginal, rectal, or sublingual 
routes of administration and can be formulated in dosage forms appropriate for 
each route of administration (see, e.g., International PCT application Nos. 
WO 93/25221 and WO 94/17784; and European Patent Application 613,683). 

Peptides, polypeptides and peptide mimetics that bind to CVSP16 
polypeptides are useful in vitro as tools for understanding the biological role of 
CVSP16 polypeptides, including the evaluation of the many factors thought to 
influence, and be influenced by, the production of CVSP16 polypeptide. Such 
peptides, polypeptides and peptide mimetics also are useful in the development of 
other compounds that bind to and modulate the activity of a CVSP16 polypeptide, 
because such compounds provide important information on the relationship 
between structure and activity that should facilitate such development. 

The peptides, polypeptides and peptide mimetics are also useful as 
competitive binders in assays to screen for new CVSP16 polypeptides or CVSP16 
polypeptide agonists. In such assay embodiments, the compounds can be used 
without modification or can be modified in a variety of ways; for example, by 
labeling, such as covalently or non-covalently joining a moiety which directly or 
indirectly provides a detectable signal. In any of these assays, the materials 
thereto can be labeled either directly or indirectly. Possibilities for direct labeling 
include label groups such as: radiolabels such as 125I, enzymes (U.S. Pat. No. 
3,645,090) such as peroxidase and alkaline phosphatase, and fluorescent labels 
(U.S. Pat. No. 3,940,475) capable of monitoring the change in fluorescence 
intensity, wavelength shift, or fluorescence polarization. Possibilities for indirect 
labeling include biotinylation of one constituent followed by binding to avidin 
coupled to one of the above label groups. The compounds also can include 
spacers or linkers in cases where the compounds are to be attached to a solid 
support. 

Moreover, based on their ability to bind to a CVSP16 polypeptide, the 
peptides, polypeptides and peptide mimetics can be used as reagents for detecting 
CVSP16 polypeptides in living cells, fixed cells, in biological fluids, in tissue 
homogenates and in purified, natural biological materials. For example, by 
labeling such peptides, polypeptides and peptide mimetics, cells having CVSP16 
polypeptides can be identified. In addition, based on their ability to bind a 
CVSP16 polypeptide, the peptides, polypeptides and peptide mimetics can be 
used in in situ staining, FACS (fluorescence-activated cell sorting), Western 
blotting, ELISA and other analytical protocols. Based on their ability to bind to a 
CVSP16 polypeptide, the peptides, polypeptides and peptide mimetics can be 
used in purification of CVSP16 polypeptides or in purifying cells expressing the 
CVSP16 polypeptide, e.g., a polypeptide encoding the protease domain of a 
CVSP16 polypeptide. 

The peptides, polypeptides and peptide mimetics also can be used as 
commercial reagents for various medical research and diagnostic uses. The 
activity of the peptides and peptide mimetics can be evaluated either in vitro or in 
vivo in one of the numerous models described in McDonald (1992) Am. J. of 
Pediatric Hematology/Oncology, 14:8-21. 

3. Peptide, polypeptides and peptide mimetic therapy 

Peptide analogs are commonly used in the pharmaceutical industry as 
non-peptide drugs with properties analogous to those of the template peptide. 
These types of non-peptide compounds are termed "peptide mimetics" or 
"peptidomimetics" (Luthman et al., A Textbook of Drug Design and Development, 
14:386-406, 2nd Ed., Harwood Academic Publishers (1996); Joachim Grante 
(1994) Angew. Chem. Int. Ed. Engl., 33:1699-1720; Fauchere (1986) J. Adv. 
Drug Res., 15:29; Veber and Freidinger (1985) TINS, p. 392; and Evans et al. 
(1987) J. Med. Chem. 30:1229). Peptide mimetics that are structurally similar to 
therapeutically useful peptides can be used to produce an equivalent or enhanced 
therapeutic or prophylactic effect. Preparation of peptidomimetics and structures 
thereof are known to those of skill in this art. 

Systematic substitution of one or more amino acids of a consensus 
sequence with a D-amino acid of the same type (e.g., D-lysine in place of L-lysine) 
can be used to generate more stable peptides. In addition, constrained peptides 
containing a consensus sequence or a substantially identical consensus sequence 
variation can be generated by methods known in the art (Rizo et al. (1992) An. 
Rev. Biochem., 61:387, incorporated herein by reference); for example, by adding 
internal cysteine residues capable of forming intramolecular disulfide bridges 
which cyclize the peptide. 

Those skilled in the art appreciate that modifications can be made to the 
peptides and mimetics without deleteriously affecting the biological or functional 
activity of the peptide. Further, the skilled artisan would know how to design 
non-peptide structures, in three-dimensional terms, that mimic the peptides that 
bind to a target molecule, e.g., a CVSP16 polypeptide or, generally, the protease 
domain of CVSP16 polypeptides (see, e.g., Eck and Sprang (1989) J. Biol. Chem., 
26:17605-18795). 

When used for diagnostic purposes, the peptides and peptide mimetics can 
be labeled with a detectable label and, accordingly, the peptides and peptide 
mimetics without such a label can serve as intermediates in the preparation of 
labeled peptides and peptide mimetics. Detectable labels can be molecules or 
compounds which, when covalently attached to the peptides and peptide 
mimetics, permit detection of the peptide and peptide mimetics in vivo, for 
example, in a patient to whom the peptide or peptide mimetic has been 
administered, or in vitro, e.g., in a sample or cells. Suitable detectable labels are 
well known in the art and include, by way of example, radioisotopes, fluorescent 
labels (e.g., fluorescein), and the like. The particular detectable label employed is 
not critical and is selected to be detectable at non-toxic levels. Selection of 
such labels is well within the skill of the art. 

Covalent attachment of a detectable label to the peptide or peptide mimetic 
is accomplished by conventional methods well known in the art. For example, 
when the 125I radioisotope is employed as the detectable label, covalent 
attachment of 125I to the peptide or the peptide mimetic can be achieved by 
incorporating the amino acid tyrosine into the peptide or peptide mimetic and then 
iodinating the peptide (see, e.g., Weaner et al. (1994) Synthesis and Applications 
of Isotopically Labelled Compounds, pp. 137-140). If tyrosine is not present in 
the peptide or peptide mimetic, incorporation of tyrosine to the N or C terminus of 
the peptide or peptide mimetic can be achieved by well known chemistry. 
Likewise, 32P can be incorporated onto the peptide or peptide mimetic as a 
phosphate moiety through, for example, a hydroxyl group on the peptide or 
peptide mimetic using conventional chemistry. 

Labeling of peptidomimetics usually involves covalent attachment of one or 
more labels, directly or through a spacer (e.g., an amide group), to non-interfering 
position(s) on the peptidomimetic that are predicted by quantitative 
structure-activity data and/or molecular modeling. Such non-interfering positions 
generally are positions that do not form direct contacts with the 
macromolecule(s) to which the peptidomimetic binds to produce the therapeutic 
effect. Derivatization (e.g., labeling) of peptidomimetics should not substantially 
interfere with the desired biological or pharmacological activity of the 
peptidomimetic. 

Peptides, polypeptides and peptide mimetics that can bind to a CVSP16 
polypeptide or the protease domain of CVSP16 polypeptides and/or modulate the 
activity thereof, or exhibit CVSP16 protease activity, can be used for treatment of 
neoplastic disease. The peptides, polypeptides and peptide mimetics can be 
delivered, in vivo or ex vivo, to the cells of a subject in need of treatment. 
Further, peptides which have CVSP16 polypeptide activity can be delivered, in 
vivo or ex vivo, to cells which carry mutant or missing alleles encoding the 
CVSP16 polypeptide gene. Any of the techniques described herein or known to 
the skilled artisan can be used for preparation and in vivo or ex vivo delivery of 
such peptides, polypeptides and peptide mimetics that are substantially free of 
other human proteins. For example, the peptides, polypeptides and peptide 
mimetics can be readily prepared by expression in a microorganism or synthesis in 
vitro. 

The peptides or peptide mimetics can be introduced into cells, in vivo or ex 
vivo, by microinjection or by use of liposomes, for example. Alternatively, the 
peptides, polypeptides or peptide mimetics can be taken up by cells, in vivo or ex 
vivo, actively or by diffusion. In addition, extracellular application of the peptide, 
polypeptides or peptide mimetic can be sufficient to effect treatment of a 
neoplastic disease. Other molecules, such as drugs or organic compounds, that: 
1) bind to a CVSP16 polypeptide or protease domain thereof; or 2) have a similar 
function or activity to a CVSP16 polypeptide or protease domain thereof, can be 
used in methods for treatment. 

4. Rational drug design 

The goal of rational drug design is to produce structural analogs of 
biologically active polypeptides or peptides of interest or of small molecules or 
peptide mimetics with which they interact (e.g., agonists and antagonists) in order 
to fashion drugs which are, e.g., more active or stable forms thereof; or which, 
for example, enhance or interfere with the function of a polypeptide in vivo (e.g., 
a CVSP16 polypeptide). In one approach, one first determines the 
three-dimensional structure of a protein of interest (e.g., a CVSP16 polypeptide or 
polypeptide having a protease domain) or, for example, of a CVSP16 polypeptide- 
ligand complex, by X-ray crystallography, by computer modeling or most typically, 
by a combination of approaches. Also, useful information regarding the structure 
of a polypeptide can be gained by modeling based on the structure of homologous 
proteins. In addition, peptides can be analyzed by an alanine scan. In this 
technique, an amino acid residue is replaced by Ala, and its effect on the 
peptide's activity is determined. Each of the amino acid residues of the peptide is 
analyzed in this manner to determine the important regions of the peptide. 
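
By way of illustration only, the enumeration step of an alanine scan can be 
sketched as follows. The function name alanine_scan and the one-letter example 
sequence are hypothetical and are not part of this disclosure; the sketch only 
lists the variants to be made, and the activity of each variant still has to be 
determined experimentally. 

```python
def alanine_scan(peptide):
    """List the single-residue Ala variants of a peptide.

    `peptide` is a string of one-letter amino acid codes. Each returned
    tuple is (1-based position, original residue, variant sequence).
    Positions that are already alanine are skipped, since replacing
    Ala with Ala gives the parent peptide back.
    """
    variants = []
    for index, residue in enumerate(peptide):
        if residue == "A":
            continue
        variant = peptide[:index] + "A" + peptide[index + 1:]
        variants.append((index + 1, residue, variant))
    return variants


if __name__ == "__main__":
    # Hypothetical 5-residue peptide used only to show the output shape.
    for position, residue, variant in alanine_scan("GRKAW"):
        print(position, residue, variant)
```

Each variant would then be assayed, and positions at which the substitution 
reduces activity are taken as candidates for the important regions of the peptide. 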

Also, a polypeptide or peptide that binds to a CVSP16 polypeptide or, 
generally, the protease domain of a CVSP16 polypeptide, can be selected by a 
functional assay, and then the crystal structure of this polypeptide or peptide can 
be determined. The polypeptide can be, for example, an antibody specific for a 
CVSP16 polypeptide and/or the protease domain of a CVSP16 polypeptide. This 
approach can yield a pharmacophore upon which subsequent drug design can be 
based. Further, it is possible to bypass the crystallography altogether by 
generating anti-idiotypic polypeptides or peptides (anti-ids) to a functional, 
pharmacologically active polypeptide or peptide that binds to a CVSP16 
polypeptide or protease domain of a CVSP16 polypeptide. As a mirror image of a 
mirror image, the binding site of the anti-ids is expected to be an analog of the 
original target molecule, e.g., a CVSP16 polypeptide or polypeptide having a 
CVSP16 polypeptide. The anti-id could then be used to identify and isolate 
peptides from chemically or biologically produced banks of peptides. 
Selected peptides would then act as the pharmacophore. 

Thus, one can design drugs which have, e.g., improved activity or stability 
or which act as modulators (e.g., inhibitors, agonists, antagonists) of CVSP16 
polypeptide activity, and are useful in the methods, particularly the methods for 
diagnosis, treatment, prevention, and screening of a neoplastic disease. By virtue 
of the availability of cloned CVSP16 polypeptide sequences, sufficient amounts of 
the CVSP16 polypeptide can be made available to perform such analytical studies 
as X-ray crystallography. In addition, the knowledge of the amino acid sequence 
of a CVSP16 polypeptide or the protease domain thereof, e.g., the protease 
domain encoded by the nucleotide sequence of SEQ ID No. 6, can provide 
guidance on computer modeling techniques in place of, or in addition to, X-ray 
crystallography. 

Methods of identifying peptides and peptide mimetics that bind to 
CVSP16 polypeptides 

Peptides having a binding affinity to the CVSP16 polypeptide (e.g., a 
CVSP16 polypeptide or a polypeptide having a protease domain of a CVSP16 
polypeptide) are provided herein and can be readily identified, for example, by 
random peptide diversity generating systems coupled with an affinity enrichment 
process. Specifically, random peptide diversity generating systems include the 
"peptides on plasmids" system (see, e.g., U.S. Patent Nos. 5,270,170 and 
5,338,665); the "peptides on phage" system (see, e.g., U.S. Patent No. 6,121,238 
and Cwirla, et al. (1990) Proc. Natl. Acad. Sci. U.S.A. 87:6378-6382); the 
"polysome" system; the "encoded synthetic library (ESL)" system; and the "very 
large scale immobilized polymer synthesis" system (see, e.g., U.S. Patent No. 
6,121,238; and Dower et al. (1991) An. Rep. Med. Chem. 26:271-280). 

For example, using the procedures described above, random peptides can 
generally be designed to have a defined number of amino acid residues in length 
(e.g., 12). To generate the collection of oligonucleotides encoding the random 
peptides, the codon motif (NNK)x can be used, where N is nucleotide A, C, G, or 
T (equimolar; depending on the methodology employed, other nucleotides can be 
employed), K is G or T (equimolar), and x is an integer corresponding to the 
number of amino acids in the peptide (e.g., 12). Each position can then specify 
any one of the 32 possible codons resulting from the NNK motif: 1 for each of 12 
amino acids, 2 for each of 5 amino acids, 3 for each of 3 amino acids, and only 
one of the three stop codons. Thus, the NNK motif encodes all of the amino 
acids, encodes only one stop codon, and reduces codon bias. 
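
The codon arithmetic stated above can be checked with a short sketch. It 
assumes the standard genetic code; the names CODON_TABLE, nnk_codons and 
nnk_degeneracy are hypothetical helpers introduced only for this illustration. 

```python
from collections import Counter
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def nnk_codons():
    """All codons matching NNK: N is any of A, C, G, T; K is G or T."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


def nnk_degeneracy():
    """Count how many NNK codons encode each amino acid (or stop)."""
    return Counter(CODON_TABLE[codon] for codon in nnk_codons())


if __name__ == "__main__":
    counts = nnk_degeneracy()
    print("codons:", len(nnk_codons()))
    print("stop codons:", counts["*"])
    # Group amino acids by the number of NNK codons that encode them.
    by_count = Counter(n for aa, n in counts.items() if aa != "*")
    print("amino acids per codon count:", dict(by_count))
```

Under these assumptions the tally reproduces the figures in the text: 32 
codons, a single stop codon (TAG), and 12, 5 and 3 amino acids encoded by one, 
two and three codons, respectively. 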

The random peptides can be presented, for example, either on the surface 
of a phage particle, as part of a fusion protein containing either the pIII or the pVIII 
coat protein of a phage fd derivative (peptides on phage) or as a fusion protein 
with the LacI peptide fusion protein bound to a plasmid (peptides on plasmids). 
The phage or plasmids, including the DNA encoding the peptides, can be identified 
and isolated by an affinity enrichment process using immobilized CVSP16 
polypeptide having a protease domain. The affinity enrichment process, 
sometimes called "panning," typically involves multiple rounds of incubating the 
phage, plasmids, or polysomes with the immobilized CVSP16 polypeptide, 
collecting the phage, plasmids, or polysomes that bind to the CVSP16 polypeptide 
(along with the accompanying DNA or mRNA), and producing more of the phage 
or plasmids (along with the accompanying LacI-peptide fusion protein) collected. 

Characteristics of peptides and peptide mimetics 
Among the peptides, polypeptides and peptide mimetics for therapeutic 
application are those having molecular weights from about 250 to about 8,000 
daltons. If such peptides are oligomerized, dimerized and/or derivatized with a 
hydrophilic polymer (e.g., to increase the affinity and/or activity of the 
compounds), the molecular weights of such peptides can be substantially greater 
and can range anywhere from about 500 to about 120,000 daltons, generally 
from about 8,000 to about 80,000 daltons. Such peptides can contain 9 or more 
amino acids that are naturally occurring or synthetic (non-naturally occurring) 
amino acids. One skilled in the art can determine the affinity and molecular 
weight of the peptides and peptide mimetics suitable for therapeutic and/or 
diagnostic purposes (e.g., see Dower et al., U.S. Patent No. 6,121,238). 

5. Methods of preparing peptides and peptide mimetics 

Peptides and peptide mimetics can be designed, using a variety of 
methods, such as, for example, the "encoded synthetic library" or "very large 
scale immobilized polymer synthesis" systems (see, e.g., U.S. Patent Nos. 
5,925,525 and 5,902,723). Using the "encoded synthetic library" or "very large 
scale immobilized polymer synthesis" systems, the minimum size of a peptide 
with an activity of interest, such as binding to a CVSP16, can be determined. In 
addition, all peptides that form the group of peptides that differ from the desired 
motif (or the minimum size of that motif) in one, two, or more residues can be 
prepared. This collection of peptides then can be screened for an ability to bind to 
a target molecule, e.g., a CVSP16 polypeptide or, generally, the protease domain 
of a CVSP16 polypeptide. This immobilized polymer synthesis system or other 
peptide synthesis methods also can be used to synthesize truncation analogs and 
deletion analogs and combinations of truncation and deletion analogs of the 
peptide compounds. 
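
As a rough illustration of the collections described above, the following 
sketch enumerates the peptides that differ from a motif at exactly one residue, 
together with its truncation analogs. It is limited to the 20 genetically 
encoded amino acids in one-letter code; the function names and the example 
motif are hypothetical and do not come from the cited systems. 

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 genetically encoded residues


def single_substitutions(motif):
    """All peptides that differ from `motif` at exactly one position."""
    return [
        motif[:i] + aa + motif[i + 1:]
        for i, original in enumerate(motif)
        for aa in AMINO_ACIDS
        if aa != original
    ]


def truncation_analogs(motif, min_len):
    """Contiguous fragments of `motif` shortened from either or both
    termini, at least `min_len` residues long, excluding the full motif."""
    n = len(motif)
    return [
        motif[start:end]
        for start in range(n)
        for end in range(start + min_len, n + 1)
        if end - start < n
    ]


if __name__ == "__main__":
    motif = "GAKL"  # hypothetical motif, for illustration only
    print(len(single_substitutions(motif)), "single-substitution variants")
    print(truncation_analogs(motif, 3))
```

A motif of n residues has 19 x n single-substitution variants, so the size of 
the collection to be synthesized and screened grows quickly once two or more 
positions are varied at the same time. 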

Peptides that bind to CVSP16 polypeptides can be prepared by classical 
methods known in the art, for example, by using standard solid phase techniques. 
The standard methods include exclusive solid phase synthesis, partial solid phase 
synthesis methods, fragment condensation, classical solution synthesis, and even 
recombinant DNA technology (see, e.g., Merrifield (1963) J. Am. Chem. Soc., 
85:2149, incorporated herein by reference). 

These procedures also can be used to synthesize peptides in which amino 
acids other than the 20 naturally occurring, genetically encoded amino acids are 
substituted at one, two, or more positions of the peptide. For instance, 
naphthylalanine can be substituted for tryptophan, facilitating synthesis. Other 
synthetic amino acids that can be substituted into the peptides include 
L-hydroxypropyl, L-3,4-dihydroxyphenylalanyl, δ amino acids such as 
L-δ-hydroxylysyl and D-δ-methylalanyl, L-α-methylalanyl, β amino acids, and 
isoquinolyl. D amino acids and non-naturally occurring synthetic amino acids also 
can be incorporated into the peptides (see, e.g., Roberts et al. (1983) Unusual 
Amino Acids in Peptide Synthesis, 5(6):341-449). 

The peptides also can be modified by phosphorylation (see, e.g., W. 
Bannwarth et al. (1996) Bioorganic and Medicinal Chemistry Letters, 
6(17):2141-2146), and other methods for making peptide derivatives (see, e.g., 
Hruby et al. (1990) Biochem. J., 268(2):249-262). Thus, peptide compounds also 
serve as a basis to prepare peptide mimetics with similar biological activity. 

Those of skill in the art recognize that a variety of techniques are available 
for constructing peptide mimetics with the same or similar desired biological 
activity as the corresponding peptide compound but with more favorable activity 
than the peptide with respect to solubility, stability, and susceptibility to 
hydrolysis and proteolysis (see, e.g., Morgan et al. (1989) An. Rep. Med. Chem., 
24:243-252). Methods for preparing peptide mimetics modified at the N-terminal 
amino group, the C-terminal carboxyl group, and/or changing one or more of the 
amido linkages in the peptide to a non-amido linkage are known to those of skill in 
the art. 

Amino terminus modifications include, but are not limited to, alkylating, 
acetylating, adding a carbobenzoyl group and forming a succinimide group (see, 
e.g., Murray et al. (1995) Burger's Medicinal Chemistry and Drug Discovery, 5th 
ed., Vol. 1, Manfred E. Wolf, ed., John Wiley and Sons, Inc.). C-terminal 
modifications include mimetics wherein the C-terminal carboxyl group is replaced 
by an ester, an amide or modifications to form a cyclic peptide. 

In addition to N-terminal and C-terminal modifications, the peptide 
compounds, including peptide mimetics, advantageously can be modified with or 
covalently coupled to one or more of a variety of hydrophilic polymers. It has 
been found that when peptide compounds are derivatized with a hydrophilic 
polymer, their solubility and circulation half-lives can be increased and their 
immunogenicity is masked, with little, if any, diminishment in their binding 
activity. Suitable nonproteinaceous polymers include, but are not limited to, 
polyalkylethers as exemplified by polyethylene glycol and polypropylene glycol, 
polylactic acid, polyglycolic acid, polyoxyalkenes, polyvinylalcohol, 
polyvinylpyrrolidone, cellulose and cellulose derivatives, dextran and dextran 
derivatives. Generally, such hydrophilic polymers have an average molecular 
weight ranging from about 500 to about 100,000 daltons, including from about 
2,000 to about 40,000 daltons and from about 5,000 to about 20,000 daltons. 
The hydrophilic polymers can have average molecular weights of about 5,000 
daltons, 10,000 daltons and 20,000 daltons. The peptide compounds can be 
dimerized and each of the dimeric subunits can be covalently attached to a 
hydrophilic polymer. The peptide compounds can be PEGylated, i.e., covalently 
attached to polyethylene glycol (PEG). 

Methods for derivatizing peptide compounds or for coupling peptides to 
such polymers have been described (see, e.g., Zallipsky (1995) Bioconjugate 
Chem., 6:150-165; Monfardini et al. (1995) Bioconjugate Chem., 6:62-69; U.S. 
Pat. No. 4,640,835; U.S. Pat. No. 4,496,689; U.S. Pat. No. 4,301,144; U.S. Pat. 
No. 4,670,417; U.S. Pat. No. 4,791,192; U.S. Pat. No. 4,179,337 and WO 
95/34326, all of which are incorporated by reference in their entirety herein). 

Other methods for making peptide derivatives are described, for example, 
in Hruby et al. (1990), Biochem J., 268(2):249-262, which is incorporated herein 
by reference. Thus, the peptide compounds also serve as structural models for 
non-peptidic compounds with similar biological activity. Those of skill in the art 
recognize that a variety of techniques are available for constructing compounds 
with the same or similar desired biological activity as a particular peptide 
compound but with more favorable activity with respect to solubility, stability, and 
susceptibility to hydrolysis and proteolysis (see, e.g., Morgan et al. (1989) An. 
Rep. Med. Chem., 24:243-252, incorporated herein by reference). These 
techniques include replacing the peptide backbone with a backbone composed of 
phosphonates, amidates, carbamates, sulfonamides, secondary amines, and 
N-methylamino acids. 

Peptide compounds can exist in a cyclized form with an intramolecular 
disulfide bond between the thiol groups of the cysteines. Alternatively, an 
intermolecular disulfide bond between the thiol groups of the cysteines can be 
produced to yield a dimeric (or higher oligomeric) compound. One or more of the 
cysteine residues also can be substituted with a homocysteine. 
I. Conjugates 

A conjugate, containing: a) a single chain protease domain (or 
proteolytically active portion thereof) of a CVSP16 polypeptide or a full length 
zymogen, activated form thereof, or two-chain or single-chain protease domain 
thereof; and b) a targeting agent linked to the CVSP16 polypeptide directly or 
via a linker, wherein the agent facilitates: i) affinity isolation or purification of 
the conjugate; ii) attachment of the conjugate to a surface; iii) detection of the 
conjugate; or iv) targeted delivery to a selected tissue or cell, is provided 
herein. The conjugate can be a chemical conjugate or a fusion protein or a 
mixture thereof. 

The targeting agent can be a protein or peptide fragment, such as a tissue 
specific or tumor specific monoclonal antibody or growth factor or fragment 
thereof linked either directly or via a linker to a CVSP16 polypeptide or a protease 
domain thereof. The targeting agent also can be a protein or peptide fragment 
that contains a protein binding sequence, a nucleic acid binding sequence, a lipid 
binding sequence, a polysaccharide binding sequence, or a metal binding 
sequence, or a linker for attachment to a solid support. In a particular 
embodiment, the conjugate contains a) the CVSP16 or portion thereof, as 
described herein; and b) a targeting agent linked to the CVSP16 polypeptide 
directly or via a linker. 

Conjugates, such as fusion proteins and chemical conjugates, of the 
CVSP16 polypeptide with a protein or peptide fragment (or plurality thereof) that 
functions, for example, to facilitate affinity isolation or purification of the CVSP16 
polypeptide domain, attachment of the CVSP16 polypeptide domain to a surface, 
or detection of the CVSP16 polypeptide domain are provided. The conjugates can 
be produced by chemical conjugation, such as via thiol linkages, and can be 
produced by recombinant means as fusion proteins. In the fusion protein, the 
peptide or fragment thereof is linked to either the N-terminus or C-terminus of the 
CVSP16 polypeptide domain. In chemical conjugates the peptide or fragment 
thereof can be linked anywhere that conjugation can be effected, and there can be 
a plurality of such peptides or fragments linked to a single CVSP16 polypeptide 
domain or to a plurality thereof. 

The targeting agent is for in vitro or in vivo delivery to a cell or tissue, and 
includes agents such as cell or tissue-specific antibodies, growth factors and other 
factors that bind to moieties expressed on specific cells; and other cell or tissue 
specific agents that promote directed delivery of a linked protein. The targeting 
agent can be one that specifically delivers the CVSP16 polypeptide to selected 
cells by interaction with a cell surface protein and internalization of conjugate or 
CVSP16 polypeptide portion thereof. 

These conjugates are used in a variety of methods and are particularly 
suited for use in methods of activation of prodrugs, such as prodrugs that upon 
cleavage by the particular CVSP16, which is localized at or near the targeted cell 
or tissue, are cytotoxic. The prodrugs are administered prior to, or simultaneously 
with, or subsequently to the conjugate. Upon delivery to the targeted cells, the 
protease activates the prodrug, which then exhibits a therapeutic effect, such as 
a cytotoxic effect. 

1. Conjugation 

Conjugates with linked CVSP16 polypeptides and/or domains thereof can 
be prepared either by chemical conjugation, recombinant DNA technology, or by 
combinations of recombinant expression and chemical conjugation. The CVSP16 
polypeptide domains and the targeting agent can be linked in any orientation and 
more than one targeting agent and/or CVSP16 polypeptide domain can be 
present in a conjugate. 

a. Fusion proteins 

Fusion proteins are provided herein. A fusion protein contains: a) one or a 
plurality of domains of a CVSP16 polypeptide; and b) a targeting agent. The 
fusion proteins are generally produced by recombinant expression of nucleic acids 
that encode the fusion protein. 

b. Chemical conjugation 

To effect chemical conjugation herein, the CVSP16 polypeptide domain is 
linked via one or more selected linkers or directly to the targeting agent. 
Chemical conjugation must be used if the targeted agent is other than a peptide or 
protein, such as a nucleic acid or a non-peptide drug. Any means known to 
those of skill in the art for chemically conjugating selected moieties can be used. 

2. Linkers 

Linkers can be included in the conjugates. The conjugates can include 
one or more linkers between the CVSP16 polypeptide portion and the targeting 
agent. Additionally, linkers are used for facilitating or enhancing immobilization of 
a CVSP16 polypeptide or portion thereof on a solid support, such as a microtiter 
plate, silicon or silicon-coated chip, glass or plastic support, such as for high 
throughput solid phase screening protocols. Any linker known to those of skill in 
the art for preparation of conjugates can be used herein. These linkers are 
typically used in the preparation of chemical conjugates; peptide linkers can be 
incorporated into fusion proteins. 

Linkers can be any moiety suitable to associate a domain of CVSP16 
polypeptide and a targeting agent. Such linkers and linkages include, but are not 
limited to, peptidic linkages, amino acid and peptide linkages, typically containing 
between one and about 60 amino acids, more generally between about 10 and 30 
amino acids, chemical linkers, such as heterobifunctional cleavable cross-linkers, 
including but not limited to, N-succinimidyl (4-iodoacetyl)-aminobenzoate, 
sulfosuccinimidyl (4-iodoacetyl)-aminobenzoate, 4-succinimidyl-oxycarbonyl-α-(2- 
pyridyldithio)toluene, sulfosuccinimidyl-6-[α-methyl-α-(pyridyldithiol)-toluamido] 
hexanoate, N-succinimidyl-3-(2-pyridyldithio)-propionate, succinimidyl 6-[3-(2- 
pyridyldithio)-propionamido] hexanoate, sulfosuccinimidyl 6-[3-(2-pyridyldithio)- 
propionamido] hexanoate, 3-(2-pyridyldithio)-propionyl hydrazide, Ellman's 
reagent, dichlorotriazinic acid, and S-(2-thiopyridyl)-L-cysteine. Other linkers 
include, but are not limited to peptides and other moieties that reduce steric 
hindrance between the domain of CVSP16 polypeptide and the targeting agent, 
intracellular enzyme substrates, linkers that increase the flexibility of the 
conjugate, linkers that increase the solubility of the conjugate, linkers that 
increase the serum stability of the conjugate, photocleavable linkers and acid 
cleavable linkers. 

Other exemplary linkers and linkages that are suitable for chemically linked 
conjugates include, but are not limited to, disulfide bonds, thioether bonds, 
hindered disulfide bonds, and covalent bonds between free reactive groups, such 
as amine and thiol groups. These bonds are produced using heterobifunctional 
reagents to produce reactive thiol groups on one or both of the polypeptides and 
then reacting the thiol groups on one polypeptide with reactive thiol groups or 
amine groups to which reactive maleimido groups or thiol groups can be attached 
on the other. Other linkers include acid cleavable linkers, such as 
bismaleimideothoxy propane, acid labile-transferrin conjugates and adipic acid 
dihydrazide, that would be cleaved in more acidic intracellular compartments; 
cross linkers that are cleaved upon exposure to UV or visible light; and linkers, 
such as the various domains, such as CH1, CH2, and CH3, from the constant 
region of human IgG (see, Batra et al. Molecular Immunol. 30:379-386 (1993)). 
In some embodiments, several linkers can be included in order to take advantage 
of desired properties of each linker. 

Chemical linkers and peptide linkers can be inserted by covalently coupling
the linker to the domain of CVSP16 polypeptide and the targeting agent. The
heterobifunctional agents, described below, can be used to effect such covalent
coupling. Peptide linkers also can be linked by expressing DNA encoding the
linker and therapeutic agent (TA), linker and targeted agent, or linker, targeted
agent and therapeutic agent (TA) as a fusion protein. Flexible linkers and linkers
that increase solubility of the conjugates are contemplated for use, either alone or
with other linkers.

a) Acid cleavable, photocleavable and heat sensitive linkers 
Acid cleavable linkers, photocleavable and heat sensitive linkers also can
be used, particularly where it can be necessary to cleave the domain of CVSP16
polypeptide to permit it to be more readily accessible to reaction. Acid cleavable
linkers include, but are not limited to, bismaleimidoethoxy propane; and adipic acid
dihydrazide linkers (see, e.g., Fattom et al. (1992) Infection & Immun. 60:584-589)
and acid labile transferrin conjugates that contain a sufficient portion of
transferrin to permit entry into the intracellular transferrin cycling pathway (see,
e.g., Welhöner et al. (1991) J. Biol. Chem. 266:4309-4314).

Photocleavable linkers are linkers that are cleaved upon exposure to light
(see, e.g., Goldmacher et al. (1992) Bioconj. Chem. 3:104-107, which linkers are
herein incorporated by reference), thereby releasing the targeted agent upon
exposure to light. Photocleavable linkers that are cleaved upon exposure to light
are known (see, e.g., Hazum et al. (1981) in Pept., Proc. Eur. Pept. Symp., 16th,
Brunfeldt, K (Ed), pp. 105-110, which describes the use of a nitrobenzyl group as
a photocleavable protective group for cysteine; Yen et al. (1989) Makromol. Chem.
190:69-82, which describes water soluble photocleavable copolymers, including
hydroxypropylmethacrylamide copolymer, glycine copolymer, fluorescein
copolymer and methylrhodamine copolymer; Goldmacher et al. (1992) Bioconj.
Chem. 3:104-107, which describes a cross-linker and reagent that undergoes
photolytic degradation upon exposure to near UV light (350 nm); and Senter et al.
(1985) Photochem. Photobiol. 42:231-237, which describes
nitrobenzyloxycarbonyl chloride cross linking reagents that produce
photocleavable linkages), thereby releasing the targeted agent upon exposure to
light. Such linkers would have particular use in treating dermatological or
ophthalmic conditions that can be exposed to light using fiber optics. After
administration of the conjugate, the eye or skin or other body part can be exposed
to light, resulting in release of the targeted moiety from the conjugate. Such
photocleavable linkers are useful in connection with diagnostic protocols in which
it is desirable to remove the targeting agent to permit rapid clearance from the
body of the animal.

b) Other linkers for chemical conjugation 
Other linkers include trityl linkers, particularly derivatized
trityl groups to generate a genus of conjugates that provide for
release of therapeutic agents at various degrees of acidity or alkalinity.
The flexibility thus afforded by the ability to preselect the pH range at
which the therapeutic agent is released allows selection of a linker based on the
known physiological differences between tissues in need of delivery of a
therapeutic agent (see, e.g., U.S. Patent No. 5,612,474). For example, the
pH of tumor tissues appears to be lower than that of normal tissues.

c) Peptide linkers
The linker moieties can be peptides. Peptide linkers can be employed in
fusion proteins and also in chemically linked conjugates. The peptide typically has
from about 2 to about 60 amino acid residues, for example from about 5 to about
40, or from about 10 to about 30 amino acid residues. The length selected
depends upon factors, such as the use for which the linker is included.

Peptide linkers are advantageous when the targeting agent is
proteinaceous. For example, the linker moiety can be a flexible spacer amino acid
sequence, such as those known in single-chain antibody research. Examples of
such known linker moieties include, but are not limited to, peptides, such as
(Gly_mSer)_n and (Ser_mGly)_n, in which n is 1 to 6, including 1 to 4 and 2 to 4, and m
is 1 to 6, including 1 to 4, and 2 to 4, enzyme cleavable linkers and others.
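
As an illustration only, the Gly/Ser repeat linkers described above can be written
out in one-letter amino acid code for given values of m and n. The sketch below is
a minimal Python example; the function name gly_ser_linker and its ser_first flag
are hypothetical and are not part of the disclosure.

```python
def gly_ser_linker(m: int, n: int, ser_first: bool = False) -> str:
    """Return a Gly/Ser repeat linker in one-letter amino acid code.

    With ser_first=False the repeat unit is m glycines followed by one serine;
    with ser_first=True it is m serines followed by one glycine.
    The unit is repeated n times. Both m and n are limited to 1..6,
    the range stated in the text.
    """
    if not (1 <= m <= 6 and 1 <= n <= 6):
        raise ValueError("m and n are each expected to be in the range 1 to 6")
    unit = ("S" * m + "G") if ser_first else ("G" * m + "S")
    return unit * n


if __name__ == "__main__":
    # m = 4, n = 3 gives three Gly-Gly-Gly-Gly-Ser repeats (15 residues).
    print(gly_ser_linker(4, 3))
    # m = 2, n = 2 with serine first gives two Ser-Ser-Gly repeats.
    print(gly_ser_linker(2, 2, ser_first=True))
```

A linker generated this way has n(m + 1) residues, so the choice of m and n can be
checked against the length ranges given above for peptide linkers.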

Additional linking moieties are known. See, for example, Huston et al.,
Proc. Natl. Acad. Sci. U.S.A. 85:5879-5883, 1988; Whitlow, M., et al., Protein
Engineering 6:989-995, 1993; Newton et al., Biochemistry 35:545-553, 1996; A.
J. Cumber et al., Bioconj. Chem. 3:397-401, 1992; Ladurner et al., J. Mol. Biol.
273:330-337, 1997; and U.S. Patent No. 4,894,443. In some embodiments,
several linkers can be included in order to take advantage of desired properties of
each linker.
3. Targeting agents

Any agent that facilitates detection, immobilization, or purification of the 
conjugate is contemplated for use herein. For chemical conjugates any moiety 
that has such properties is contemplated; for fusion proteins, the targeting agent 
is a protein, peptide or fragment thereof that is sufficient to effect the targeting 
activity. Contemplated targeting agents include those that deliver the CVSP16
polypeptide or portion thereof to selected cells and tissues. Such agents include 
tumor specific monoclonal antibodies and portions thereof, growth factors, such 
as FGF, EGF, PDGF, VEGF, cytokines, including chemokines, and other such 
agents.

4. Nucleic acids, plasmids and cells

Isolated nucleic acid fragments encoding fusion proteins are provided. The 
nucleic acid fragment that encodes the fusion protein includes: a) nucleic acid 
encoding a protease domain of a CVSP16 polypeptide; and b) nucleic acid 
encoding a protein, peptide or effective fragment thereof that facilitates: i) affinity
isolation or purification of the fusion protein; ii) attachment of the fusion protein to 
a surface; or iii) detection of the fusion protein. Generally, the nucleic acid is 
DNA. 

Plasmids for replication and vectors for expression that contain the above
nucleic acid fragments are also provided. Cells containing the plasmids and
vectors are also provided. The cells can be any suitable host including, but
not limited to, bacterial cells, yeast cells, fungal cells, plant cells, insect cells and
animal cells. The nucleic acids, plasmids, and cells containing the plasmids can
be prepared according to methods known in the art including any described
herein.

Also provided are methods for producing the above fusion proteins. An
exemplary method includes the steps of growing cells (i.e., culturing the cells so
that they proliferate) containing a plasmid encoding the fusion protein under
conditions whereby the fusion protein is expressed by the cell, and recovering the
expressed fusion protein. Methods for expressing and recovering recombinant
proteins are well known in the art (see generally, Current Protocols in Molecular
Biology (1998) § 16, John Wiley & Sons, Inc.) and such methods can be used for
expressing and recovering the expressed fusion proteins.

The recovered fusion proteins can be isolated or purified by methods
known in the art such as, for example, centrifugation, filtration, chromatography,
electrophoresis and immunoprecipitation, or by a combination thereof (see
generally, Current Protocols in Molecular Biology (1998) § 10, John Wiley &
Sons, Inc.). Generally the recovered fusion protein is isolated or purified through
affinity binding between the protein or peptide fragment of the fusion protein and
an affinity binding moiety. As discussed in the above sections regarding the
construction of the fusion proteins, any affinity binding pairs can be constructed
and used in the isolation or purification of the fusion proteins. For example, the
affinity binding pairs can be protein binding sequences/protein, DNA binding
sequences/DNA sequences, RNA binding sequences/RNA sequences, lipid binding
sequences/lipid, polysaccharide binding sequences/polysaccharide, or metal
binding sequences/metal.
5. Immobilization and supports or substrates therefor

In certain embodiments, where the targeting agents are designed for
linkage to surfaces, the CVSP16 polypeptide can be attached by linkage, such as
ionic, covalent, non-covalent or other chemical interaction, to a surface of a
support or matrix material. Immobilization can be effected directly or via a linker.
The CVSP16 polypeptide can be immobilized on any suitable support, including,
but not limited to, silicon chips, and other supports described herein and
known to those of skill in the art. A plurality of CVSP16 polypeptides or protease
domains thereof can be attached to a support, such as an array (i.e., a pattern of
two or more) of conjugates on the surface of a silicon chip or other chip for use in
high throughput protocols and formats.

It also is noted that the domains of the CVSP16 polypeptide can be linked 
directly to the surface or via a linker without a targeting agent linked thereto. 
Hence chips containing arrays of the domains of the CVSP16 polypeptide are also
provided. 

The matrix material or solid supports contemplated herein are generally any
of the insoluble materials known to those of skill in the art to immobilize ligands
and other molecules, and are those that are used in many chemical syntheses and
separations. Such supports are used, for example, in affinity chromatography, in
the immobilization of biologically active materials, and during chemical syntheses
of biomolecules, including proteins, amino acids and other organic molecules and
polymers. The preparation of and use of supports is well known to those of skill
in this art; there are many such materials and preparations thereof known. For
example, naturally-occurring support materials, such as agarose and cellulose, can
be isolated from their respective sources, and processed according to known
protocols, and synthetic materials can be prepared in accord with known protocols.

The supports are typically insoluble materials that are solid, porous,
deformable, or hard, and have any required structure and geometry, including, but
not limited to: beads, pellets, disks, capillaries, hollow fibers, needles, solid fibers,
random shapes, thin films and membranes. Thus, the item can be fabricated from
the matrix material or combined with it, such as by coating all or part of the
surface or impregnating particles.

Typically, when the matrix is particulate, the particles are at least about
10-2000 µm, but can be smaller or larger, depending upon the selected
application. Selection of the matrices is governed, at least in part, by their
physical and chemical properties, such as solubility, functional groups, mechanical
stability, surface area, swelling propensity, hydrophobic or hydrophilic properties
and intended use.

If necessary, the support matrix material can be treated to contain an
appropriate reactive moiety. In some cases, the support matrix material already
containing the reactive moiety can be obtained commercially. The support matrix
material containing the reactive moiety can thereby serve as the matrix support
upon which molecules are linked. Materials containing reactive surface moieties
such as amino silane linkages, hydroxyl linkages or carboxysilane linkages can be
produced by well established surface chemistry techniques involving silanization
reactions, or the like. Examples of these materials are those having surface silicon
oxide moieties, covalently linked to gamma-aminopropylsilane, and other organic
moieties; N-[3-(triethoxysilyl)propyl]phthalamic acid; and
bis-(2-hydroxyethyl)-aminopropyltriethoxysilane. Exemplary readily available
materials containing amino group reactive functionalities include, but are not
limited to, para-aminophenyltriethoxysilane. Also derivatized polystyrenes and
other such polymers are well known and readily available to those of skill in this
art (e.g., the Tentagel® Resins are available with a multitude of functional groups,
and are sold by Rapp Polymere, Tübingen, Germany; see, U.S. Patent No. 4,908,405
and U.S. Patent No. 5,292,814; see, also Butz et al. Peptide Res. 7:20-23 (1994);
and Klein et al. Immunobiol. 190:53-66 (1994)).

These matrix materials include any material that can act as a support
matrix for attachment of the molecules of interest. Such materials are known to
those of skill in this art, and include those that are used as a support matrix.
These materials include, but are not limited to, inorganics, natural polymers, and
synthetic polymers, including, but not limited to: cellulose, cellulose
derivatives, acrylic resins, glass, silica gels, polystyrene, gelatin, polyvinyl
pyrrolidone, co-polymers of vinyl and acrylamide, polystyrene cross-linked with
divinylbenzene and others (see, Merrifield, Biochemistry 3:1385-1390 (1964)),
polyacrylamides, latex gels, dextran, rubber, silicon,
plastics, nitrocellulose, celluloses and natural sponges. Of particular interest herein
are highly porous glasses (see, e.g., U.S. Patent No. 4,244,721) and others
prepared by mixing a borosilicate, alcohol and water.

Synthetic supports include, but are not limited to: acrylamides, dextran
derivatives and dextran co-polymers, agarose-polyacrylamide blends, other
polymers and co-polymers with various functional groups, methacrylate
derivatives and co-polymers, polystyrene and polystyrene copolymers (see, e.g.,
Merrifield, Biochemistry 3:1385-1390 (1964); Berg et al., Innovation Perspect.
Solid Phase Synth. Collect. Pap., Int. Symp., 1st, Epton, Roger (Ed), pp. 453-459
(1990); Berg et al., Pept., Proc. Eur. Pept. Symp., 20th, Jung, G. et al. (Eds), pp.
196-198 (1989); Berg et al., J. Am. Chem. Soc. 111:8024-8026 (1989); Kent et
al., Isr. J. Chem. 17:243-247 (1979); Kent et al., J. Org. Chem. 43:2845-2852
(1978); Mitchell et al., Tetrahedron Lett. 42:3795-3798 (1976); U.S. Patent No.
4,507,230; U.S. Patent No. 4,006,117; and U.S. Patent No. 5,389,449). Such
materials include those made from polymers and co-polymers such as
polyvinylalcohols, acrylates and acrylic acids such as polyethylene-co-acrylic acid,
polyethylene-co-methacrylic acid, polyethylene-co-ethylacrylate,
polyethylene-co-methyl acrylate, polypropylene-co-acrylic acid,
polypropylene-co-methyl-acrylic acid, polypropylene-co-ethylacrylate,
polypropylene-co-methyl acrylate, polyethylene-co-vinyl acetate,
polypropylene-co-vinyl acetate, and those containing acid anhydride groups such as
polyethylene-co-maleic anhydride and polypropylene-co-maleic anhydride.
Liposomes have also been used as solid supports for affinity purifications (Powell
et al. Biotechnol. Bioeng. 33:173 (1989)).



WO 2004/005471 



PCT/US2003/020959 




Numerous methods have been developed for the immobilization of proteins
and other biomolecules onto solid or liquid supports (see, e.g., Mosbach, Methods
in Enzymology 44 (1976); Weetall, Immobilized Enzymes, Antigens, Antibodies,
and Peptides, (1975); Kennedy et al., Solid Phase Biochemistry, Analytical and
Synthetic Aspects, Scouten, ed., pp. 253-391 (1983); see, generally, Affinity
Techniques. Enzyme Purification: Part B. Methods in Enzymology, Vol. 34, ed. W.
B. Jakoby, M. Wilchek, Acad. Press, N.Y. (1974); and Immobilized Biochemicals
and Affinity Chromatography, Advances in Experimental Medicine and Biology,
vol. 42, ed. R. Dunlap, Plenum Press, N.Y. (1974)).

Among the most commonly used methods are absorption and adsorption or
covalent binding to the support, either directly or via a linker, such as the
numerous disulfide linkages, thioether bonds, hindered disulfide bonds, and
covalent bonds between free reactive groups, such as amine and thiol groups,
known to those of skill in the art (see, e.g., the PIERCE CATALOG,
ImmunoTechnology Catalog & Handbook, 1992-1993, which describes the
preparation of and use of such reagents and provides a commercial source for
such reagents; Wong, Chemistry of Protein Conjugation and Cross-Linking, CRC
Press (1993); see also DeWitt et al., Proc. Natl. Acad. Sci. U.S.A. 90:6909
(1993); Zuckermann et al., J. Am. Chem. Soc. 114:10646 (1992); Kurth et al., J.
Am. Chem. Soc. 116:2661 (1994); Ellman et al., Proc. Natl. Acad. Sci. U.S.A.
91:4708 (1994); Sucholeiki, Tetrahedron Lttrs. 35:7307 (1994); Su-Sun Wang, J.
Org. Chem. 41:3258 (1976); Padwa et al., J. Org. Chem. 41:3550 (1971); and
Vedejs et al., J. Org. Chem. 49:575 (1984), which describe photosensitive
linkers).

To effect immobilization, a composition containing the protein or other
biomolecule is contacted with a support material such as alumina, carbon, an
ion-exchange resin, cellulose, glass or a ceramic. Fluorocarbon polymers have
been used as supports to which biomolecules have been attached by adsorption
(see, U.S. Patent No. 3,843,443; Published International PCT Application WO
86/03840).

J. Prognosis and diagnosis

CVSP16 polypeptide proteins, domains, analogs, and derivatives thereof,
and encoding nucleic acids (and sequences complementary thereto), and
anti-CVSP16 polypeptide antibodies, can be used in diagnostics, particularly diagnosis
of cervical cancer, colon and pancreatic cancers, and possibly other cancers,
including prostate, colon, ovary, cervix and breast cancers. Such molecules can
be used in assays, such as immunoassays, to detect, prognose, diagnose, or
monitor various conditions, diseases, and disorders affecting CVSP16 polypeptide
expression, or monitor the treatment thereof. For purposes herein, the presence
of CVSP16s in body fluids or tumor tissues is of particular interest.

In particular, such an immunoassay is carried out by a method including
contacting a sample derived from a patient with an anti-CVSP16 polypeptide
antibody under conditions such that specific binding can occur, and detecting or
measuring the amount of any specific binding by the antibody. Such binding of
antibody, in tissue sections, can be used to detect aberrant CVSP16 polypeptide
localization or aberrant (e.g., increased, decreased or absent) levels of CVSP16
polypeptide or aberrant activity of CVSP16 or aberrant processing of CVSP16. For
example, antibody to CVSP16 polypeptide can be used to assay a patient tissue
or serum sample for the presence of CVSP16 polypeptide where an aberrant level
of CVSP16 polypeptide is an indication of a diseased condition.

The immunoassays which can be used include but are not limited to
competitive and non-competitive assay systems using techniques such as western
blots, radioimmunoassays, ELISA (enzyme linked immunosorbent assay),
"sandwich" immunoassays, immunoprecipitation assays, precipitin reactions, gel
diffusion precipitin reactions, immunodiffusion assays, agglutination assays,
complement-fixation assays, immunoradiometric assays, fluorescent
immunoassays and protein A immunoassays.

CVSP16 polypeptide genes and related nucleic acid sequences and
subsequences, including complementary sequences, also can be used in
hybridization assays. CVSP16 polypeptide nucleic acid sequences, or
subsequences thereof containing at least about 8 nucleotides, generally 14 or 16
or 30 or more, generally less than 1000 or up to 100, contiguous nucleotides can
be used as hybridization probes. Hybridization assays can be used to detect,
prognose, diagnose, or monitor conditions, disorders, or disease states associated
with aberrant changes in CVSP16 polypeptide expression and/or activity as
described herein. In particular, such a hybridization assay is carried out by a
method that includes contacting a sample containing nucleic acid with a nucleic
acid probe capable of hybridizing to CVSP16 polypeptide encoding DNA or RNA, under
conditions such that hybridization can occur, and detecting or measuring any
resulting hybridization.
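
By way of illustration only, candidate probes of a chosen length can be enumerated
from a target sequence as the reverse complements of its contiguous windows. The
following Python sketch assumes a DNA target given as a string of A, C, G and T; the
function names and the short example sequence are hypothetical placeholders and do
not represent any CVSP16 sequence.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(target: str, length: int = 16) -> list[str]:
    """List antisense probes of the given length for every contiguous window.

    The minimum of 8 nucleotides mirrors the lower bound stated in the text;
    the default of 16 is one of the lengths the text mentions.
    """
    if length < 8:
        raise ValueError("probe length is expected to be at least 8 nucleotides")
    target = target.upper()
    return [
        reverse_complement(target[i:i + length])
        for i in range(len(target) - length + 1)
    ]


if __name__ == "__main__":
    placeholder_target = "ATGCGTACCGGATTCAGT"  # hypothetical 18-nt example
    for probe in candidate_probes(placeholder_target, length=16):
        print(probe)
```

Each probe returned is complementary to one window of the target strand; in
practice probe selection also depends on hybridization conditions, which this
sketch does not model.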

In a specific embodiment, a method of diagnosing a disease or disorder
characterized by detecting an aberrant level of a CVSP16 polypeptide in a subject
is provided herein by measuring the level of the DNA, RNA, protein or activity,
such as protease and/or binding activity, of a CVSP16 polypeptide in a sample
derived from the subject. An increase or decrease in the level of the DNA, RNA,
protein or functional activity of the CVSP16 polypeptide, relative to the level of
the DNA, RNA, protein or functional activity found in an analogous sample not
having the disease or disorder indicates the presence of the disease or disorder in
the subject.
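
The comparison described above can be sketched as a simple ratio test. In the
Python example below, the function name classify_level and the two-fold threshold
are assumptions made purely for illustration; the text does not specify any
particular cutoff, and a real assay would set one empirically.

```python
def classify_level(sample: float, reference: float,
                   fold_threshold: float = 2.0) -> str:
    """Compare a measured level with the level in an analogous reference sample.

    Returns "increased", "decreased" or "unchanged". The fold_threshold
    (default 2.0) is a hypothetical cutoff, not a value from the text.
    """
    if sample < 0 or reference <= 0 or fold_threshold <= 1:
        raise ValueError("levels must be non-negative, reference > 0, threshold > 1")
    ratio = sample / reference
    if ratio >= fold_threshold:
        return "increased"
    if ratio <= 1 / fold_threshold:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # Arbitrary units; the same units must be used for sample and reference.
    print(classify_level(30.0, 10.0))  # ratio 3.0
    print(classify_level(4.0, 10.0))   # ratio 0.4
    print(classify_level(12.0, 10.0))  # ratio 1.2
```

Either an "increased" or a "decreased" result corresponds to the aberrant level
referred to in the text; the same function applies whether the measured quantity
is DNA, RNA, protein or activity.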

Kits for diagnostic use are also provided, that contain in one or more
containers an anti-CVSP16 polypeptide antibody, and, optionally, a labeled binding
partner to the antibody. Alternatively, the anti-CVSP16 polypeptide antibody can
be labeled (with a detectable marker, e.g., a chemiluminescent, enzymatic,
fluorescent, or radioactive moiety). A kit also is provided that includes in one or
more containers a nucleic acid probe capable of hybridizing to SP protein-encoding
RNA. In a specific embodiment, a kit can contain in one or more containers a pair
of primers (e.g., each in the size range of 6-30 nucleotides) that are capable of
priming amplification, e.g., by polymerase chain reaction (see e.g., Innis et al.,
1990, PCR Protocols, Academic Press, Inc., San Diego, CA), ligase chain reaction
(see EP 320,308), use of Qβ replicase, cyclic probe reaction, or other methods
known in the art under appropriate reaction conditions of at least a portion of a SP
protein-encoding nucleic acid. A kit can optionally further comprise in a container
a predetermined amount of a purified CVSP16 polypeptide or nucleic acid, e.g.,
for use as a standard or control.

K. Pharmaceutical compositions and modes of administration
1. Components of the compositions

Pharmaceutical compositions containing the identified compounds that
modulate the activity of a CVSP16 polypeptide are provided herein. Also provided
are combinations of a compound that modulates an activity of a CVSP16
polypeptide and another treatment or compound for treatment of a neoplastic
disorder, such as a chemotherapeutic compound. The CVSP16 polypeptide
modulator and the anti-tumor agent can be packaged as separate compositions for
administration together or sequentially or intermittently. Alternatively, they can
be provided as a single composition for administration or as two compositions for
administration as a single composition. The combinations can be packaged as kits.

a. CVSP16 polypeptide inhibitors

Any CVSP16 polypeptide inhibitors, including those described herein when
used alone or in combination with other compounds, that can alleviate, reduce,
ameliorate, prevent, or place or maintain in a state of remission clinical
symptoms or diagnostic markers associated with neoplastic diseases, including
undesired and/or uncontrolled angiogenesis, can be used in the present
combinations.

For example, the CVSP16 polypeptide inhibitor is an antibody or fragment
thereof that specifically reacts with a CVSP16 polypeptide or the protease domain
thereof or other region thereof, such as the activation region, or is an inhibitor of
the CVSP16 polypeptide production, an inhibitor of CVSP16 polypeptide
membrane-localization or an inhibitor of the expression or activation of a CVSP16
polypeptide.

b. Anti-angiogenic agents and anti-tumor agents 

Any anti-angiogenic agents and anti-tumor agents, including those
described herein, when used alone or in combination with other compounds, that
can alleviate, reduce, ameliorate, prevent, or place or maintain in a state of
remission clinical symptoms or diagnostic markers associated with undesired
and/or uncontrolled angiogenesis and/or tumor growth and metastasis, particularly
solid neoplasms, vascular malformations and cardiovascular disorders, chronic
inflammatory diseases and aberrant wound repairs, circulatory disorders, CREST
syndromes, dermatological disorders, or ocular disorders, can be used in the
combinations. Also contemplated are anti-tumor agents for use in combination
with an inhibitor of a CVSP16 polypeptide.
c. Anti-tumor agents and anti-angiogenic agents

The compounds identified by the methods provided herein or provided 
herein can be used in combination with anti-tumor agents and/or anti-angiogenesis 
agents. 

2. Formulations and route of administration 

The compounds herein and agents can be formulated as pharmaceutical
compositions, typically for single dosage administration. The concentrations of
the compounds in the formulations are effective for delivery of an amount, upon
administration, that is effective for the intended treatment. Typically, the
compositions are formulated for single dosage administration. To formulate a
composition, the weight fraction of a compound or mixture thereof is dissolved,
suspended, dispersed or otherwise mixed in a selected vehicle at an effective
concentration such that the treated condition is relieved or ameliorated.
Pharmaceutical carriers or vehicles suitable for administration of the compounds
provided herein include any such carriers known to those skilled in the art to be
suitable for the particular mode of administration.

In addition, the compounds can be formulated as the sole pharmaceutically
active ingredient in the composition or can be combined with other active
ingredients. Liposomal suspensions, including tissue-targeted liposomes, also can
be suitable as pharmaceutically acceptable carriers. These can be prepared
according to methods known to those skilled in the art. For example, liposome
formulations can be prepared as described in U.S. Patent No. 4,522,811.

The active compound is included in the pharmaceutically acceptable carrier
in an amount sufficient to exert a therapeutically useful effect in the absence of
undesirable side effects on the patient treated. The therapeutically effective
concentration can be determined empirically by testing the compounds in known
in vitro and in vivo systems, such as the assays provided herein.

The concentration of active compound in the drug composition depends on
absorption, inactivation and excretion rates of the active compound, the
physicochemical characteristics of the compound, the dosage schedule, and
amount administered as well as other factors known to those of skill in the art.
Typically a therapeutically effective dosage is contemplated. The amounts
administered can be on the order of 0.001 to 1 mg/ml, including about
0.005-0.05 mg/ml and about 0.01 mg/ml, of blood volume. Pharmaceutical dosage unit
forms are prepared to provide from about 1 mg to about 1000 mg, including
from about 10 to about 500 mg, and including about 25-75 mg of the essential
active ingredient or a combination of essential ingredients per dosage unit form.
The precise dosage can be empirically determined.
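
As a worked illustration of the arithmetic only, an amount expressed per ml of
blood volume converts to a total amount by multiplying by the subject's blood
volume. The Python sketch below assumes a blood volume of 5000 ml, which is a
hypothetical round figure for an adult and not a value taken from the text; the
function name is likewise hypothetical.

```python
def total_amount_mg(conc_mg_per_ml: float, blood_volume_ml: float = 5000.0) -> float:
    """Total amount (mg) = concentration (mg per ml of blood) x blood volume (ml).

    The default blood volume of 5000 ml is an illustrative assumption.
    """
    if conc_mg_per_ml < 0 or blood_volume_ml <= 0:
        raise ValueError("concentration must be >= 0 and blood volume must be > 0")
    return conc_mg_per_ml * blood_volume_ml


if __name__ == "__main__":
    # Concentrations named in the text: the ends of the stated range and
    # the value of about 0.01 mg/ml.
    for conc in (0.001, 0.01, 1.0):
        print(f"{conc} mg/ml -> about {total_amount_mg(conc):.0f} mg")
```

Under this assumed blood volume, about 0.01 mg/ml corresponds to roughly 50 mg,
which falls within the 25-75 mg dosage unit range mentioned above.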

The active ingredient can be administered at once, or can be divided into a
number of smaller doses to be administered at intervals of time. It is understood
that the precise dosage and duration of treatment is a function of the disease
being treated and can be determined empirically using known testing protocols or
by extrapolation from in vivo or in vitro test data. It is to be noted that
concentrations and dosage values also can vary with the severity of the condition to be
alleviated. It is to be further understood that for any particular subject, specific
dosage regimens should be adjusted over time according to the individual need
and the professional judgment of the person administering or supervising the
administration of the compositions, and that the concentration ranges set forth
herein are exemplary only and are not intended to limit the scope or use of the
claimed compositions and combinations containing them.

Pharmaceutically acceptable derivatives include acids, salts, esters,
hydrates, solvates and prodrug forms. The derivative is typically selected such
that its pharmacokinetic properties are superior to the corresponding neutral
compound.

Thus, effective concentrations or amounts of one or more of the
compounds provided herein or pharmaceutically acceptable derivatives thereof are
mixed with a suitable pharmaceutical carrier or vehicle for systemic, topical or
local administration to form pharmaceutical compositions. Compounds are
included in an amount effective for ameliorating or treating the disorder for which
treatment is contemplated. The concentration of active compound in the
composition depends on absorption, inactivation, excretion rates of the active
compound, the dosage schedule, amount administered, particular formulation as
well as other factors known to those of skill in the art.
Solutions or suspensions used for parenteral, intradermal, subcutaneous, or
topical application can include any of the following components: a sterile diluent,
such as water for injection, saline solution, fixed oil, polyethylene glycol,
glycerine, propylene glycol or other synthetic solvent; antimicrobial agents, such
as benzyl alcohol and methyl parabens; antioxidants, such as ascorbic acid and
sodium bisulfite; chelating agents, such as ethylenediaminetetraacetic acid
(EDTA); buffers, such as acetates, citrates and phosphates; and agents for the
adjustment of tonicity such as sodium chloride or dextrose. Parenteral
preparations can be enclosed in ampules, disposable syringes or single or multiple
dose vials made of glass, plastic or other suitable material.

In instances in which the compounds exhibit insufficient solubility,
methods for solubilizing compounds can be used. Such methods are known to
those of skill in this art, and include, but are not limited to, using cosolvents, such
as dimethylsulfoxide (DMSO), using surfactants, such as Tween®, or dissolution in
aqueous sodium bicarbonate. Derivatives of the compounds, such as prodrugs of
the compounds, also can be used in formulating effective pharmaceutical
compositions. For ophthalmic indications, the compositions are formulated in an
ophthalmically acceptable carrier. For the ophthalmic uses herein, local
administration, either by topical administration or by injection, is contemplated.
Time release formulations are also desirable. Typically, the compositions are
formulated for single dosage administration, so that a single dose administers an
effective amount.

Upon mixing or addition of the compound with the vehicle, the resulting
mixture can be a solution, suspension, emulsion or other composition. The form
of the resulting mixture depends upon a number of factors, including the intended
mode of administration and the solubility of the compound in the selected carrier
or vehicle. If necessary, pharmaceutically acceptable salts or other derivatives of
the compounds are prepared.

The compound is included in the pharmaceutically acceptable carrier in an
amount sufficient to exert a therapeutically useful effect in the absence of
undesirable side effects on the patient treated. It is understood that the number
and degree of side effects depend upon the condition for which the compounds are
administered. For example, certain toxic and undesirable side effects are tolerated
when treating life-threatening illnesses that would not be tolerated when treating
disorders of lesser consequence.

The compounds also can be mixed with other active materials that do not
impair the desired action, or with materials that supplement the desired action
known to those of skill in the art. The formulations of the compounds and agents
for use herein include those suitable for oral, rectal, topical, inhalational, buccal
(e.g., sublingual), parenteral (e.g., subcutaneous, intramuscular, intradermal, or
intravenous), transdermal administration or any route. The most suitable route in
any given case depends on the nature and severity of the condition being treated
and on the nature of the particular active compound which is being used. The
formulations are provided for administration to humans and animals in unit dosage
forms, such as tablets, capsules, pills, powders, granules, sterile parenteral
solutions or suspensions, and oral solutions or suspensions, and oil-water
emulsions containing suitable quantities of the compounds or pharmaceutically
acceptable derivatives thereof. The pharmaceutically therapeutically active
compounds and derivatives thereof are typically formulated and administered in
unit-dosage forms or multiple-dosage forms. Unit-dose forms as used herein
refer to physically discrete units suitable for human and animal subjects and
packaged individually as is known in the art. Each unit-dose contains a
predetermined quantity of the therapeutically active compound sufficient to
produce the desired therapeutic effect, in association with the required
pharmaceutical carrier, vehicle or diluent. Examples of unit-dose forms include
ampoules and syringes and individually packaged tablets or capsules. Unit-dose
forms can be administered in fractions or multiples thereof. A multiple-dose form
is a plurality of identical unit-dosage forms packaged in a single container to be
administered in segregated unit-dose form. Examples of multiple-dose forms
include vials, bottles of tablets or capsules or bottles of pints or gallons. Hence,
a multiple-dose form is a multiple of unit-doses which are not segregated in
packaging.

The composition can contain along with the active ingredient: a diluent
such as lactose, sucrose, dicalcium phosphate, or carboxymethylcellulose; a
lubricant, such as magnesium stearate, calcium stearate and talc; and a binder
such as starch, natural gums, such as gum acacia, gelatin, glucose, molasses,
polyvinylpyrrolidine, celluloses and derivatives thereof, povidone, crospovidones
and other such binders known to those of skill in the art. Liquid pharmaceutically
administrable compositions can, for example, be prepared by dissolving,
dispersing, or otherwise mixing an active compound as defined above and optional
pharmaceutical adjuvants in a carrier, such as, for example, water, saline,
aqueous dextrose, glycerol, glycols, ethanol, and the like, to thereby form a
solution or suspension. If desired, the pharmaceutical composition to be
administered also can contain minor amounts of nontoxic auxiliary substances
such as wetting agents, emulsifying agents, or solubilizing agents, pH buffering
agents and the like, for example, acetate, sodium citrate, cyclodextrine
derivatives, sorbitan monolaurate, triethanolamine sodium acetate, triethanolamine
oleate, and other such agents. Methods of preparing such dosage forms are
known, or will be apparent, to those skilled in this art (see, e.g., Remington's
Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., 15th Edition,
1975). The composition or formulation to be administered contains a quantity of
the active compound in an amount sufficient to alleviate the symptoms of the
treated subject.

Dosage forms or compositions containing active ingredient in the range of
0.005% to 100% with the balance made up from non-toxic carrier can be
prepared. For oral administration, the pharmaceutical compositions can take the
form of, for example, tablets or capsules prepared by conventional means with
pharmaceutically acceptable excipients such as binding agents (e.g.,
pregelatinized maize starch, polyvinyl pyrrolidone or hydroxypropyl
methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium
hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica);
disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents
(e.g., sodium lauryl sulphate). The tablets can be coated by methods well-known
in the art.

The pharmaceutical preparation also can be in liquid form, for example,
solutions, syrups or suspensions, or can be presented as a drug product for
reconstitution with water or other suitable vehicle before use. Such liquid
preparations can be prepared by conventional means with pharmaceutically
acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose
derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or
acacia); non-aqueous vehicles (e.g., almond oil, oily esters, or fractionated
vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or
sorbic acid).

Formulations suitable for rectal administration can be presented as unit
dose suppositories. These can be prepared by admixing the active compound
with one or more conventional solid carriers, for example, cocoa butter, and then
shaping the resulting mixture.

Formulations suitable for topical application to the skin or to the eye
generally are formulated as an ointment, cream, lotion, paste, gel, spray, aerosol
or oil. Carriers which can be used include vaseline, lanoline, polyethylene
glycols, alcohols, and combinations of two or more thereof. The topical
formulations can further advantageously contain 0.05 to 15 percent by weight of
thickeners selected from among hydroxypropyl methyl cellulose, methyl cellulose,
polyvinylpyrrolidone, polyvinyl alcohol, poly (alkylene glycols), poly/hydroxyalkyl,
(meth)acrylates or poly(meth)acrylamides. A topical formulation is often applied
by instillation or as an ointment into the conjunctival sac. It also can be used for
irrigation or lubrication of the eye, facial sinuses, and external auditory meatus. It
also can be injected into the anterior eye chamber and other places. The topical
formulations in the liquid state also can be present in a hydrophilic three-dimensional
polymer matrix in the form of a strip, contact lens, and the like from
which the active components are released.

For administration by inhalation, the compounds for use herein can be
delivered in the form of an aerosol spray presentation from pressurized packs or a
nebulizer, with the use of a suitable propellant, e.g., dichlorodifluoromethane,
trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable
gas. In the case of a pressurized aerosol, the dosage unit can be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g.,
gelatin, for use in an inhaler or insufflator can be formulated containing a powder
mix of the compound and a suitable powder base such as lactose or starch.

Formulations suitable for buccal (sublingual) administration include, for 
example, lozenges containing the active compound in a flavored base, usually 
sucrose and acacia or tragacanth; and pastilles containing the compound in an 
inert base such as gelatin and glycerin or sucrose and acacia. 

The compounds can be formulated for parenteral administration by
injection, e.g., by bolus injection or continuous infusion. Formulations for
injection can be presented in unit dosage form, e.g., in ampules or in multi-dose
containers, with an added preservative. The compositions can be suspensions,
solutions or emulsions in oily or aqueous vehicles, and can contain formulatory
agents such as suspending, stabilizing and/or dispersing agents. Alternatively, the
active ingredient can be in powder form for reconstitution with a suitable vehicle,
e.g., sterile pyrogen-free water or other solvents, before use.

Formulations suitable for transdermal administration can be presented as
discrete patches adapted to remain in intimate contact with the epidermis of the
recipient for a prolonged period of time. Such patches suitably contain the active
compound as an optionally buffered aqueous solution of, for example, 0.1 to 0.2
M concentration with respect to the active compound. Formulations suitable for
transdermal administration also can be delivered by iontophoresis (see, e.g.,
Pharmaceutical Research 3 (6), 318 (1986)) and typically take the form of an
optionally buffered aqueous solution of the active compound.
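
By way of illustration only, the 0.1 to 0.2 M range stated above for a patch
solution can be converted to a mass of active compound once a molecular weight
and a solution volume are chosen. The following is a minimal sketch; the function
names, the 10 mL volume and the 500 g/mol molecular weight are hypothetical
values assumed for the example and are not taken from this description.

```python
def grams_for_molar_solution(molar_conc: float, volume_ml: float,
                             mol_weight: float) -> float:
    """Mass (g) of compound needed for a solution of the given molarity.

    mass = concentration (mol/L) * volume (L) * molecular weight (g/mol)
    """
    if molar_conc < 0 or volume_ml < 0 or mol_weight <= 0:
        raise ValueError("concentration and volume must be non-negative, "
                         "molecular weight must be positive")
    return molar_conc * (volume_ml / 1000.0) * mol_weight


def patch_mass_range(volume_ml: float, mol_weight: float,
                     low_molar: float = 0.1, high_molar: float = 0.2):
    """Lower and upper mass (g) for the 0.1 to 0.2 M range stated in the text."""
    return (grams_for_molar_solution(low_molar, volume_ml, mol_weight),
            grams_for_molar_solution(high_molar, volume_ml, mol_weight))


if __name__ == "__main__":
    # Hypothetical compound: 500 g/mol, 10 mL of patch solution.
    low, high = patch_mass_range(volume_ml=10.0, mol_weight=500.0)
    print(f"{low:.2f} g to {high:.2f} g of active compound")
```

The calculation is ordinary molarity arithmetic; the appropriate amount for any
particular compound is determined empirically as described herein.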

The pharmaceutical compositions also can be administered by controlled
release means and/or delivery devices (see, e.g., U.S. Patent Nos. 3,536,809;
3,598,123; 3,630,200; 3,845,770; 3,847,770; 3,916,899; 4,008,719;
4,687,610; 4,769,027; 5,059,595; 5,073,543; 5,120,548; 5,354,566;
5,591,767; 5,639,476; 5,674,533 and 5,733,566).

Desirable blood levels can be maintained by a continuous infusion of the
active agent as ascertained by plasma levels. It should be noted that the
attending physician would know how to and when to terminate, interrupt or adjust
therapy to lower dosage due to toxicity, or bone marrow, liver or kidney
dysfunctions. Conversely, the attending physician would also know how to and
when to adjust treatment to higher levels if the clinical response is not adequate
(precluding toxic side effects).

The efficacy and/or toxicity of the CVSP16 polypeptide inhibitor(s), alone
or in combination with other agents, also can be assessed by the methods known
in the art (see, e.g., O'Reilly, Investigational New Drugs 15:5-13 (1997)).

The active compounds or pharmaceutically acceptable derivatives can be
prepared with carriers that protect the compound against rapid elimination from
the body, such as time release formulations or coatings.

Kits containing the compositions and/or the combinations with instructions
for administration thereof are provided. The kit can further include a needle or
syringe, typically packaged in sterile form, for injecting the complex, and/or a
packaged alcohol pad. Instructions are optionally included for administration of
the active agent by a clinician or by the patient.

Finally, the compounds or CVSP16 polypeptides or protease domains
thereof or compositions containing any of the preceding agents can be packaged
as articles of manufacture containing packaging material, a compound or suitable
derivative thereof provided herein, which is effective for treatment of diseases or
disorders contemplated herein, within the packaging material, and a label that
indicates that the compound or a suitable derivative thereof is for treating the
diseases or disorders contemplated herein. The label can optionally include the
disorders for which the therapy is warranted.

L. Methods of treatment

The compounds identified by the methods herein are used for treating or
preventing neoplastic diseases in an animal, particularly a mammal, including a
human, and are provided herein. In one embodiment, the method includes
administering to a mammal an effective amount of an inhibitor of a CVSP16
polypeptide, whereby the disease or disorder is treated or prevented.

In an embodiment, the CVSP16 polypeptide inhibitor used in the treatment
or prevention is administered with a pharmaceutically acceptable carrier or
excipient. The mammal treated can be a human. The inhibitors provided herein
are those identified by the screening assays. In addition, antibodies and antisense
nucleic acids or double-stranded RNA (dsRNA), such as RNAi, are contemplated.

The treatment or prevention method can further include administering an
anti-angiogenic treatment or agent or anti-tumor agent simultaneously with, prior
to or subsequent to the CVSP16 polypeptide inhibitor, which can be any
compound identified that inhibits the activity of a CVSP16 polypeptide. Such
compounds include small molecule modulators, a natural product or derivative
thereof, an antibody or a fragment or derivative thereof containing a binding
region thereof against the CVSP16 polypeptide, an antisense nucleic acid or
double-stranded RNA (dsRNA), such as RNAi, encoding a portion of the CVSP16
polypeptide (or the complement thereof), and a nucleic acid containing at least a
portion of a gene encoding the CVSP16 polypeptide into which a heterologous
nucleotide sequence has been inserted such that the heterologous sequence
inactivates the biological activity of at least a portion of the gene encoding the
CVSP16 polypeptide, in which the portion of the gene encoding a CVSP16
polypeptide flanks the heterologous sequence to promote homologous
recombination with a genomic gene (or endogenous gene) encoding a CVSP16
polypeptide. In addition, such molecules are generally less than about 1000 nt
long.

1. Antisense treatment

In a specific embodiment, as described hereinabove, CVSP16 polypeptide
function is reduced or inhibited by CVSP16 polypeptide antisense nucleic acids, to
treat or prevent neoplastic disease. The therapeutic or prophylactic use of nucleic
acids of at least six nucleotides that are antisense to a gene or cDNA encoding
CVSP16 polypeptide or a portion thereof is provided. A CVSP16 polypeptide
"antisense" nucleic acid as used herein refers to a nucleic acid capable of
hybridizing to a portion of a CVSP16 polypeptide RNA (generally mRNA) by virtue
of some sequence complementarity, and generally under high stringency
conditions. The antisense nucleic acid can be complementary to a coding and/or
noncoding region of a CVSP16 polypeptide mRNA. Such antisense nucleic acids
have utility as therapeutics that reduce or inhibit CVSP16 polypeptide function,
and can be used in the treatment or prevention of disorders as described supra.
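
The sequence relationship between a target RNA and a fully complementary
antisense oligonucleotide can be illustrated computationally. The following is a
minimal sketch only: the function names and the short example sequence are
hypothetical, the example sequence is not a CVSP16 sequence, and full
Watson-Crick complementarity is assumed even though only some sequence
complementarity is required as noted above.

```python
# Watson-Crick pairing for RNA bases.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target: str) -> str:
    """Return the RNA strand fully complementary to `target`, written 5'->3'.

    The antisense strand pairs antiparallel to the target, so it is the
    reverse of the base-by-base complement.
    """
    target = target.upper()
    unexpected = set(target) - set("ACGU")
    if unexpected:
        raise ValueError(f"non-RNA characters in target: {sorted(unexpected)}")
    return target.translate(_RNA_COMPLEMENT)[::-1]


def antisense_windows(mrna: str, length: int = 6):
    """List (start, antisense oligo) for every window of `length` nucleotides.

    The minimum of six nucleotides follows the text above.
    """
    if length < 6:
        raise ValueError("antisense nucleic acids are of at least six nucleotides")
    return [(i, antisense_rna(mrna[i:i + length]))
            for i in range(len(mrna) - length + 1)]


if __name__ == "__main__":
    example = "AUGGCCA"  # hypothetical fragment, for illustration only
    for start, oligo in antisense_windows(example, length=6):
        print(start, oligo)
```

Such a listing only enumerates candidate sequences; it does not predict
hybridization under high stringency conditions or biological activity.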

The CVSP16 polypeptide antisense nucleic acids are of at least six
nucleotides and are generally oligonucleotides (ranging from 6 to about 150
nucleotides including 6 to 50 nucleotides). The antisense molecule can be
complementary to all or a portion of the protease domain. For example, the
oligonucleotide is at least 10 nucleotides, at least 15 nucleotides, at least 100
nucleotides, or at least 125 nucleotides. The oligonucleotides can be DNA or
RNA or chimeric mixtures or derivatives or modified versions thereof, single-stranded
or double-stranded. The oligonucleotide can be modified at the base
moiety, sugar moiety and/or phosphate backbone. The oligonucleotide can
include other appending groups such as peptides, or agents facilitating transport
across the cell membrane (see, e.g., Letsinger et al., Proc. Natl. Acad. Sci. U.S.A.
86:6553-6556 (1989); Lemaitre et al., Proc. Natl. Acad. Sci. U.S.A. 84:648-652
(1987); PCT Publication No. WO 88/09810, published December 15, 1988) or
blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134, published April
25, 1988), hybridization-triggered cleavage agents (see, e.g., Krol et al.,
BioTechniques 6:958-976 (1988)) or intercalating agents (see, e.g., Zon, Pharm.
Res. 5:539-549 (1988)).

The CVSP16 polypeptide antisense nucleic acid generally is an
oligonucleotide, typically single-stranded DNA or RNA or an analog thereof or mixtures
thereof. For example, the oligonucleotide includes a sequence antisense to a
portion of a human CVSP16 polypeptide. The oligonucleotide can be modified at
any position on its structure with substituents generally known in the art.

The CVSP16 polypeptide antisense oligonucleotide can include at least one
modified base moiety which is selected from the group including, but not limited
to, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic
acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w,
and 2,6-diaminopurine.

In another embodiment, the oligonucleotide includes at least one modified
sugar moiety selected from the group including but not limited to arabinose,
2-fluoroarabinose, xylulose, and hexose. The oligonucleotide can include at least
one modified phosphate backbone selected from a phosphorothioate, a
phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a
phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, and a
formacetal or analog thereof.

The oligonucleotide can be an α-anomeric oligonucleotide. An α-anomeric
oligonucleotide forms specific double-stranded hybrids with complementary RNA
in which the strands run parallel to each other (Gautier et al., Nucl. Acids Res.
15:6625-6641 (1987)).

The oligonucleotide can be conjugated to another molecule, e.g., a peptide,
hybridization-triggered cross-linking agent, transport agent and
hybridization-triggered cleavage agent.

The oligonucleotides can be synthesized by standard methods known in
the art, e.g., by use of an automated DNA synthesizer (such as are commercially
available from Biosearch, Applied Biosystems, and other sources). As examples,
phosphorothioate oligonucleotides can be synthesized by the method of Stein et
al. (Nucl. Acids Res. 16:3209 (1988)), methylphosphonate oligonucleotides can
be prepared by use of controlled pore glass polymer supports (Sarin et al., Proc.
Natl. Acad. Sci. U.S.A. 85:7448-7451 (1988)), and others.

In a specific embodiment, the CVSP16 polypeptide antisense
oligonucleotide includes catalytic RNA or a ribozyme (see, e.g., PCT International
Publication WO 90/11364, published October 4, 1990; Sarver et al., Science
247:1222-1225 (1990)). In another embodiment, the oligonucleotide is a
2'-O-methylribonucleotide (Inoue et al., Nucl. Acids Res. 15:6131-6148 (1987)), or a
chimeric RNA-DNA analogue (Inoue et al., FEBS Lett. 215:327-330 (1987)).
Alternatively, the oligonucleotide can be double-stranded RNA (dsRNA) such as
RNAi.

In an alternative embodiment, the CVSP16 polypeptide antisense nucleic
acid is produced intracellularly by transcription from an exogenous sequence. For
example, a vector can be introduced in vivo such that it is taken up by a cell,
within which cell the vector or a portion thereof is transcribed, producing an
antisense nucleic acid (RNA). Such a vector would contain a sequence encoding
the CVSP16 polypeptide antisense nucleic acid. Such a vector can remain
episomal or become chromosomally integrated, as long as it can be transcribed to
produce the desired antisense RNA. Such vectors can be constructed by
recombinant DNA technology methods standard in the art. Vectors can be
plasmid, viral, or others known in the art, used for replication and expression in
mammalian cells. Expression of the sequence encoding the CVSP16 polypeptide
antisense RNA can be by any promoter known in the art to act in mammalian,
including human, cells. Such promoters can be inducible or constitutive. Such
promoters include but are not limited to: the SV40 early promoter region (Bernoist
and Chambon, Nature 290:304-310 (1981)), the promoter contained in the 3' long
terminal repeat of Rous sarcoma virus (Yamamoto et al., Cell 22:787-797 (1980)),
the herpes thymidine kinase promoter (Wagner et al., Proc. Natl. Acad. Sci.
U.S.A. 78:1441-1445 (1981)), the regulatory sequences of the metallothionein
gene (Brinster et al., Nature 296:39-42 (1982)).

The antisense nucleic acids include sequence complementary to at least a
portion of an RNA transcript of a CVSP16 polypeptide gene, including a human
CVSP16 polypeptide gene. Absolute complementarity is not required.

The amount of CVSP16 polypeptide antisense nucleic acid that is effective
in the treatment or prevention of neoplastic disease depends on the nature of the
disease, and can be determined empirically by standard clinical techniques.
Where possible, it is desirable to determine the antisense cytotoxicity in cells in
vitro, and then in useful animal model systems prior to testing and use in humans.

2. RNA interference

RNA interference (RNAi) (see, e.g., Chuang et al. (2000) Proc. Natl. Acad.
Sci. U.S.A. 97:4985) can be employed to inhibit the expression of a gene
encoding a CVSP16. Interfering RNA (RNAi) fragments, particularly
double-stranded (ds) RNAi, can be used to generate loss-of-CVSP16 function. Methods
relating to the use of RNAi to silence genes in organisms, including mammals, C.
elegans, Drosophila, plants and humans, are known (see, e.g., Fire et al.
(1998) Nature 391:806-811; Fire (1999) Trends Genet. 15:358-363; Sharp (2001)
Genes Dev. 15:485-490; Hammond et al. (2001) Nature Rev. Genet. 2:110-119;
Tuschl (2001) Chem. Biochem. 2:239-245; Hamilton et al. (1999) Science
286:950-952; Hammond et al. (2000) Nature 404:293-296; Zamore et al. (2000)
Cell 101:25-33; Bernstein et al. (2001) Nature 409:363-366; Elbashir et al.
(2001) Genes Dev. 15:188-200; Elbashir et al. (2001) Nature 411:494-498;
International PCT application No. WO 01/29058; International PCT application No.
WO 99/32619). By selecting appropriate sequences, expression of dsRNA can
interfere with accumulation of endogenous mRNA encoding a CVSP16.

Double-stranded RNA (dsRNA)-expressing constructs are introduced into a
host, such as an animal or plant. This can be accomplished by any of numerous
methods known in the art, for example by including it in a replicable vector, such
as a viral vector (see discussion below), that remains episomal or integrates into
the genome. The dsRNA can be introduced into an appropriate nucleic acid
expression vector and administered so that it becomes intracellular, e.g., by
infection using a defective or attenuated retroviral or other viral vector (see U.S.
Patent No. 4,980,286). Other methods include, but are not limited to, direct
injection of naked DNA, using microparticle bombardment (e.g., a gene gun;
Biolistic, Dupont), coating with lipids or cell-surface receptors or transfecting
agents, encapsulation in liposomes, microparticles, or microcapsules,
administering it in linkage to a peptide which is known to enter the nucleus,
administering it in linkage to a ligand subject to receptor-mediated endocytosis
(see, e.g., Wu and Wu, J. Biol. Chem. 262:4429-4432 (1987)) (which can be
used to target cell types specifically expressing the receptors) and other methods.
In other methods, a nucleic acid-ligand complex can be formed in which the ligand
is a fusogenic viral peptide to disrupt endosomes, allowing the nucleic acid to
avoid lysosomal degradation. In other methods, the nucleic acid can be targeted
in vivo for cell specific uptake and expression, by targeting a specific receptor
(see, e.g., Published International PCT application Nos. WO 92/06180, dated April
16, 1992 (Wu et al.); WO 92/22635, dated December 23, 1992 (Wilson et al.);
WO 92/20316, dated November 26, 1992 (Findeis et al.); WO 93/14188, dated
July 22, 1993 (Clarke et al.); WO 93/20221, dated October 14, 1993 (Young)).
Alternatively, the nucleic acid can be introduced intracellularly and incorporated
within host cell DNA for expression, by homologous recombination (Koller and
Smithies, Proc. Natl. Acad. Sci. USA 85:8932-8935 (1989); Zijlstra et al., Nature
342:435-438 (1989)).

RNAi can be used to inhibit expression in vitro or in vivo. Regions that
include at least about 21 (or 21) nucleotides that are selective (i.e., unique) for
CVSP16 are used to prepare the RNAi. Smaller fragments of about 21 nucleotides
can be transformed directly (i.e., in vitro or in vivo) into cells; larger RNAi dsRNA
molecules are generally introduced using vectors that encode them. dsRNA
molecules are at least about 21 bp long or longer, such as 50, 100, 150, 200 and
longer. Methods, reagents and protocols for introducing nucleic acid molecules
into cells in vitro and in vivo are known to those of skill in the art.
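
The selection of 21-nucleotide regions that are selective for a target can be
illustrated by a simple screen. The following is a minimal sketch under stated
assumptions: "selective" is approximated here as a window that does not occur
verbatim in any sequence of a user-supplied background set, the function names
are hypothetical, and the toy sequences are not CVSP16 sequences. Practical
design would also weigh near matches and other criteria not addressed here.

```python
def kmers(seq: str, k: int) -> set:
    """All distinct windows of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def selective_windows(target: str, background, k: int = 21):
    """Return (start, window) for each k-nt window of `target` that is absent
    from every sequence in `background` (exact-match screen only)."""
    if k < 1:
        raise ValueError("k must be positive")
    target = target.upper()
    background_kmers = set()
    for seq in background:
        background_kmers |= kmers(seq.upper(), k)
    return [(i, target[i:i + k])
            for i in range(len(target) - k + 1)
            if target[i:i + k] not in background_kmers]


if __name__ == "__main__":
    # Toy example with k=4 so the result is easy to follow by eye.
    hits = selective_windows("ACGUACGUAA", ["ACGUAC"], k=4)
    for start, window in hits:
        print(start, window)
```

With the default k of 21 the same function lists candidate 21-nucleotide
regions; the background set would be chosen by the practitioner.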

3. Gene Therapy

In an exemplary embodiment, nucleic acids that include a sequence of
nucleotides encoding a CVSP16 polypeptide or functional domains or derivative
thereof are administered to promote CVSP16 polypeptide function, by way of
gene therapy. In this embodiment, the nucleic acid produces an encoded protein
(or the nucleic acid or encoded RNA) that mediates a therapeutic effect by
promoting CVSP16 polypeptide function. Any of the methods for gene therapy
available in the art can be used (see, Goldspiel et al., Clinical Pharmacy 12:488-505
(1993); Wu and Wu, Biotherapy 3:87-95 (1991); Tolstoshev, An. Rev.
Pharmacol. Toxicol. 32:573-596 (1993); Mulligan, Science 260:926-932 (1993);
and Morgan and Anderson, An. Rev. Biochem. 62:191-217 (1993); TIBTECH
11(5):155-215 (1993)).

For example, one therapeutic composition for gene therapy includes a
CVSP16 polypeptide-encoding nucleic acid that is part of an expression vector
that expresses a CVSP16 polypeptide or domain, fragment or chimeric protein
thereof in a suitable host. In particular, such a nucleic acid has a promoter
operably linked to the CVSP16 polypeptide coding region, the promoter being
inducible or constitutive, and, optionally, tissue-specific. In another particular
embodiment, a nucleic acid molecule is used in which the CVSP16 polypeptide
coding sequences and any other desired sequences are flanked by regions that
promote homologous recombination at a desired site in the genome, thus
providing for intrachromosomal expression of the SP protein nucleic acid (Koller
and Smithies, Proc. Natl. Acad. Sci. USA 85:8932-8935 (1989); Zijlstra et al.,
Nature 342:435-438 (1989)).

Delivery of the nucleic acid into a patient can be either direct, in which
case the patient is directly exposed to the nucleic acid or nucleic acid-carrying
vector, or indirect, in which case cells are first transformed with the nucleic acid
in vitro, then transplanted into the patient. These two approaches are known,
respectively, as in vivo or ex vivo gene therapy.

In a specific embodiment, the nucleic acid is directly administered in vivo,
and it is expressed to produce the encoded product. This can be accomplished by
any of numerous methods known in the art, e.g., by constructing it as part of an
appropriate nucleic acid expression vector and administering it so that it becomes
intracellular, e.g., by infection using a defective or attenuated retroviral or other
viral vector (see U.S. Patent No. 4,980,286), or by direct injection of naked DNA,
or by use of microparticle bombardment (e.g., a gene gun; Biolistic, Dupont), or
coating with lipids or cell-surface receptors or transfecting agents, encapsulation
in liposomes, microparticles, or microcapsules, or by administering it in linkage to
a peptide which is known to enter the nucleus, by administering it in linkage to a
ligand subject to receptor-mediated endocytosis (see, e.g., Wu and Wu, J. Biol.
Chem. 262:4429-4432 (1987)) (which can be used to target cell types
specifically expressing the receptors). In another embodiment, a nucleic
acid-ligand complex can be formed in which the ligand is a fusogenic viral peptide to
disrupt endosomes, allowing the nucleic acid to avoid lysosomal degradation. In
yet another embodiment, the nucleic acid can be targeted in vivo for cell specific
uptake and expression, by targeting a specific receptor (see, e.g., Published
International PCT application Nos. WO 92/06180, dated April 16, 1992 (Wu et
al.); WO 92/22635, dated December 23, 1992 (Wilson et al.); WO 92/20316,
November 26, 1992 (Findeis et al.); WO 93/14188, July 22, 1993 (Clarke et al.);
WO 93/20221, October 14, 1993 (Young)). Alternatively, the nucleic acid can be
introduced intracellularly and incorporated within host cell DNA for expression, by
homologous recombination (Koller and Smithies, Proc. Natl. Acad. Sci. USA
85:8932-8935 (1989); Zijlstra et al., Nature 342:435-438 (1989)).

In a specific embodiment, a viral vector that contains the CVSP16
polypeptide nucleic acid is used. For example, a retroviral vector can be used (see
Miller et al., Meth. Enzymol. 217:581-599 (1993)). These retroviral vectors have
been modified to delete retroviral sequences that are not necessary for packaging
of the viral genome and integration into host cell DNA. The CVSP16 polypeptide
nucleic acid to be used in gene therapy is cloned into the vector, which facilitates
delivery of the gene into a patient. More detail about retroviral vectors can be
found in Boesen et al., Biotherapy 6:291-302 (1994), which describes the use of
a retroviral vector to deliver the mdr1 gene to hematopoietic stem cells in order to
make the stem cells more resistant to chemotherapy. Other references illustrating
the use of retroviral vectors in gene therapy are: Clowes et al., J. Clin. Invest.
93:644-651 (1994); Kiem et al., Blood 83:1467-1473 (1994); Salmons and
Gunzberg, Human Gene Therapy 4:129-141 (1993); and Grossman and Wilson,
Curr. Opin. in Genetics and Devel. 3:110-114 (1993).

Adenoviruses are other viral vectors that can be used in gene therapy.
Adenoviruses are especially attractive vehicles for delivering genes to respiratory
epithelia. Adenoviruses naturally infect respiratory epithelia where they cause a
mild disease. Other targets for adenovirus-based delivery systems are liver, the
central nervous system, endothelial cells, and muscle. Adenoviruses have the
advantage of being capable of infecting non-dividing cells. Kozarsky and Wilson,
Current Opinion in Genetics and Development 3:499-503 (1993) present a review
of adenovirus-based gene therapy. Bout et al., Human Gene Therapy 5:3-10
(1994) demonstrated the use of adenovirus vectors to transfer genes to the
respiratory epithelia of rhesus monkeys. Other instances of the use of
adenoviruses in gene therapy can be found in Rosenfeld et al., Science 252:431-
434 (1991); Rosenfeld et al., Cell 68:143-155 (1992); and Mastrangeli et al., J.
Clin. Invest. 91:225-234 (1993). Adeno-associated virus (AAV) also is used in
gene therapy (Walsh et al., Proc. Soc. Exp. Biol. Med. 204:289-300 (1993)).

Another approach to gene therapy involves transferring a gene to cells in
tissue culture by such methods as electroporation, lipofection, calcium phosphate
mediated transfection, or viral infection. Usually, the method of transfer includes
the transfer of a selectable marker to the cells. The cells are then placed under
selection to isolate those cells that have taken up and are expressing the
transferred gene. Those cells are then delivered to a patient.

In this embodiment, the nucleic acid is introduced into a cell prior to
administration in vivo of the resulting recombinant cell. Such introduction can be
carried out by any method known in the art, including but not limited to
transfection, electroporation, microinjection, infection with a viral or bacteriophage
vector containing the nucleic acid sequences, cell fusion, chromosome-mediated
gene transfer, microcell-mediated gene transfer, spheroplast fusion and other
delivery methods. Numerous techniques are known in the art for the introduction
of foreign genes into cells (see, e.g., Loeffler and Behr, Meth. Enzymol. 217:599-
618 (1993); Cohen et al., Meth. Enzymol. 217:618-644 (1993); Cline, Pharmac.
Ther. 29:69-92 (1985)) and can be used, provided that the necessary
developmental and physiological functions of the recipient cells are not disrupted.
The technique should provide for the stable transfer of the nucleic acid to the cell,
so that the nucleic acid is expressible by the cell and generally heritable and
expressible by its cell progeny.

The resulting recombinant cells can be delivered to a patient by various
methods known in the art. In an embodiment, epithelial cells are injected, e.g.,
subcutaneously. In another embodiment, recombinant skin cells can be applied as
a skin graft onto the patient. Recombinant blood cells (e.g., hematopoietic stem
or progenitor cells) can be administered intravenously. The amount of cells
envisioned for use depends on the desired effect, patient state and other
parameters, and can be determined by one skilled in the art.

Cells into which a nucleic acid can be introduced for purposes of gene
therapy encompass any desired, available cell type, and include but are not limited
to epithelial cells, endothelial cells, keratinocytes, fibroblasts, muscle cells,
hepatocytes; blood cells such as T-lymphocytes, B-lymphocytes,
monocytes, macrophages, neutrophils, eosinophils, megakaryocytes,
granulocytes; various stem or progenitor cells, in particular hematopoietic stem or
progenitor cells, such as stem cells obtained from bone marrow, umbilical
cord blood, peripheral blood, fetal liver, and other sources thereof.

For example, a cell used for gene therapy is autologous to the patient. In
an embodiment in which recombinant cells are used in gene therapy, a CVSP16
polypeptide nucleic acid is introduced into the cells such that it is expressible by
the cells or their progeny, and the recombinant cells are then administered in vivo
for therapeutic effect. In a specific embodiment, stem or progenitor cells are
used. Any stem and/or progenitor cells which can be isolated and maintained in
vitro can potentially be used in accordance with this embodiment. Such stem
cells include but are not limited to hematopoietic stem cells (HSC), stem cells of
epithelial tissues such as the skin and the lining of the gut, embryonic heart
muscle cells, liver stem cells (PCT Publication WO 94/08598, dated April 28,
1994), and neural stem cells (Stemple and Anderson, Cell 71:973-985 (1992)).

Epithelial stem cells (ESCs) or keratinocytes can be obtained from tissues
such as the skin and the lining of the gut by known procedures (Rheinwald, Meth.
Cell Bio. 21A:229 (1980)). In stratified epithelial tissue such as the skin, renewal
occurs by mitosis of stem cells within the germinal layer, the layer closest to the
basal lamina. Stem cells within the lining of the gut provide for a rapid renewal
rate of this tissue. ESCs or keratinocytes obtained from the skin or lining of the
gut of a patient or donor can be grown in tissue culture (Rheinwald, Meth. Cell
Bio. 21A:229 (1980); Pittelkow and Scott, Mayo Clinic Proc. 61:771 (1986)). If
the ESCs are provided by a donor, a method for suppression of host versus graft
reactivity (e.g., irradiation, drug or antibody administration to promote moderate
immunosuppression) also can be used.

With respect to hematopoietic stem cells (HSC), any technique which
provides for the isolation, propagation, and maintenance in vitro of HSC can be
used in this embodiment. Techniques by which this can be accomplished include
(a) the isolation and establishment of HSC cultures from bone marrow cells
isolated from the future host, or a donor, or (b) the use of previously established
long-term HSC cultures, which can be allogeneic or xenogeneic. Non-autologous
HSC generally are used with a method of suppressing transplantation immune
reactions of the future host/patient. In a particular embodiment, human bone
marrow cells can be obtained from the posterior iliac crest by needle aspiration
(see, e.g., Kodo et al., J. Clin. Invest. 73:1377-1384 (1984)). For example, the
HSCs can be made highly enriched or in substantially pure form. This enrichment
can be accomplished before, during, or after long-term culturing, and can be done
by any techniques known in the art. Long-term cultures of bone marrow cells can
be established and maintained by using, for example, modified Dexter cell culture
techniques (Dexter et al., J. Cell Physiol. 91:335 (1977)) or Witlock-Witte culture
techniques (Witlock and Witte, Proc. Natl. Acad. Sci. USA 79:3608-3612
(1982)).

In a specific embodiment, the nucleic acid to be introduced for purposes of 
gene therapy includes an inducible promoter operably linked to the coding region, 
such that expression of the nucleic acid is controllable by controlling the presence 
or absence of the appropriate inducer of transcription. 

3. Prodrugs

A method for treating tumors is provided. The method is practiced by
administering a prodrug that is cleaved at a specific site by a CVSP16 to release
an active drug or a precursor that can be converted to active drug in vivo. Upon
contact with a cell that expresses CVSP16 activity, the prodrug is converted into
an active drug. The prodrug can be a conjugate that contains the active agent,
such as an anti-tumor drug, such as a cytotoxic agent, or other therapeutic agent
(TA), linked to a substrate for the targeted CVSP16, such that the drug or agent is
inactive or unable to enter a cell, in the conjugate, but is activated upon cleavage.
The prodrug, for example, can contain an oligopeptide, typically a relatively short
peptide of less than about 10 amino acids, that is proteolytically cleaved by the
targeted CVSP16. Cytotoxic agents include, but are not limited to, alkylating
agents, antiproliferative agents and tubulin binding agents. Others include vinca
drugs, mitomycins, bleomycins and taxanes.
M. Animal models 

Transgenic animal models and animals, such as rodents, including mice and
rats, cows, chickens, pigs, goats, sheep, monkeys, including gorillas, and other
primates, are provided herein. In particular, transgenic non-human animals that
contain heterologous nucleic acid encoding a CVSP16 polypeptide or a transgenic
animal in which expression of the polypeptide has been altered, such as by
replacing or modifying the promoter region or other regulatory region of the
endogenous gene, are provided. Such an animal can be produced by promoting
recombination between endogenous nucleic acid and an exogenous CVSP16 gene
that could be over-expressed or mis-expressed, such as by expression under a
strong promoter, via homologous or other recombination event.

Transgenic animals can be produced by introducing the nucleic acid using
any known method of delivery, including, but not limited to, microinjection,
lipofection and other modes of gene delivery into a germline cell or somatic cells,
such as an embryonic stem cell. Typically the nucleic acid is introduced into a
cell, such as an embryonic stem cell (ES), followed by injecting the ES cells into a
blastocyst, and implanting the blastocyst into a foster mother, which is followed
by the birth of a transgenic animal. Generally, introduction of a
heterologous nucleic acid molecule into a chromosome of the animal occurs by a
recombination between the heterologous CVSP16-encoding nucleic acid and
endogenous nucleic acid. The heterologous nucleic acid can be targeted to a
specific chromosome.

In some instances, knockout animals can be produced. Such an animal
can be initially produced by promoting homologous recombination between a
CVSP16 polypeptide gene in its chromosome and an exogenous CVSP16
polypeptide gene that has been rendered biologically inactive (typically by
insertion of a heterologous sequence, e.g., an antibiotic resistance gene). In one
embodiment, this homologous recombination is performed by transforming
embryo-derived stem (ES) cells with a vector containing the insertionally
inactivated CVSP16 polypeptide gene, such that homologous recombination
occurs, followed by injecting the ES cells into a blastocyst, and implanting the
blastocyst into a foster mother, followed by the birth of the chimeric animal
("knockout animal") in which a CVSP16 polypeptide gene has been inactivated
(see Capecchi, Science 244:1288-1292 (1989)). The chimeric animal can be bred
to produce homozygous knockout animals, which can then be used to produce
additional knockout animals. Knockout animals include, but are not limited to,
mice, hamsters, sheep, pigs, cattle, and other non-human mammals. For
example, a knockout mouse is produced. The resulting animals can serve as
models of specific diseases, such as cancers, that exhibit under-expression of a
CVSP16 polypeptide. Such knockout animals can be used as animal models of
such diseases, e.g., to screen for or test molecules for the ability to treat or
prevent such diseases or disorders.

Other types of transgenic animals also can be produced, including those
that over-express the CVSP16 polypeptide. Such animals include "knock-in"
animals, which are animals in which the normal gene is replaced by a variant, such
as a mutant, an over-expressed form, or other form. For example, the endogenous
gene of one species, such as a rodent, can be replaced by the gene from another
species, such as from a human. Animals also can be produced by non-
homologous recombination into other sites in a chromosome, including animals
that have a plurality of integration events.

After production of the first generation transgenic animal, a chimeric animal
can be bred to produce additional animals with over-expressed or mis-expressed
CVSP16 polypeptides. Such animals include, but are not limited to, mice,
hamsters, sheep, pigs, cattle and other non-human mammals. The resulting
animals can serve as models of specific diseases, such as cancers, that exhibit
over-expression or mis-expression of a CVSP16 polypeptide. Such animals can be
used as animal models of such diseases, e.g., to screen for or test molecules for
the ability to treat or prevent such diseases or disorders. In a specific
embodiment, a mouse with over-expressed or mis-expressed CVSP16 polypeptide
is produced.

The following examples are included for illustrative purposes only and are
not intended to limit the scope of the invention. 

EXAMPLE 1 

Identification of CVSP16 
The protein sequence of the protease domain of matriptase (MTSP1;
accession number AF118224) was used to search the human HTGS (High
Throughput Genomic Sequence) database using the tblastn algorithm
(www.ncbi.nlm.nih.gov/BLAST). This search and alignment algorithm compares a
protein query sequence against a nucleotide sequence database dynamically
translated in all six reading frames (both strands). Among the proteases identified
was the protease designated herein as CVSP16. The partial protein sequence of
the CVSP16 protease domain in the database shares 34% identity to the protease
domain of matriptase. A search using the algorithm blastp
(www.ncbi.nlm.nih.gov/BLAST) indicated that the translated sequence of
CVSP16 showed 34% identity to prostamin (BAB20376.1), 36% identity to corin
(NP_006578.1), 36% identity to marapsin (NP_114154.1), 35% identity to
prostasin (NP_002764.1), 39% identity to transmembrane tryptase
(NP_036599.1) and 36% identity to serine protease 22 (NP_071402.1). Based
on the incomplete and unordered human genome sequence
(www.ncbi.nlm.nih.gov/genome/seq), CVSP16 appears to be localized on
chromosome 16 (locus: 16p13.3; clone accession number AC009088.7). A
search of sequences deposited in GenBank showed that one entry
(XM_097026.1) corresponding to a hypothetical, genomic sequence-derived
protein had homologous nucleotide sequence (74%) with CVSP16, although the
reported translated protein sequence only had 10% homology to the partial
CVSP16 polypeptide sequence. A search of the EST database showed the
existence of two EST clones (AW450407 from human colon and AI190509 from
human fetal heart). Both EST clones shared 98-99% sequence identity with a
portion of the CVSP16 protease domain sequence.
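
The six-frame translation step that underlies a protein-versus-nucleotide search can be illustrated with a minimal sketch. It is not the BLAST implementation and performs no database search or alignment; it only translates a nucleotide sequence in all six reading frames using the standard genetic code. The function names are illustrative, and the input is the first 60 nucleotides of the CVSP16 cDNA as given in the sequence listing.

```python
# Minimal six-frame translation sketch (standard genetic code only).
# Function and variable names are illustrative, not part of any BLAST tool.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; an incomplete trailing codon is dropped."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# First 60 nucleotides of the CVSP16 cDNA, taken from the sequence listing.
CDNA_START = "ATGGCCCGGCAGCTGCTCCTCCCCCTTGTGGTGCTTGTCATCAGTCCCATCCCAGGAGCC"
frames = six_frame_translations(CDNA_START)
print(frames["+1"])  # MARQLLLPLVVLVISPIPGA
```

Frame +1 reproduces the first 20 residues of the translated CVSP16 sequence; the other five frames are what a translated search would additionally consider.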

Cloning of CVSP16 from human small intestine cell line using RACE reactions
Using the EST-derived cDNA sequence homologous to CVSP16, four
oligonucleotide primers hybridizing within the protease domain sequence were
designed and synthesized. The sequence for the 5' end primer was
5'-CCCTCTGGGTAGCCAGCACACAGCATC-3' (SEQ ID No. 7) and that of the 3'
end primer was 5'-GCCATCGTGGTGCCGGCCAACTACAG-3' (SEQ ID No. 8). The
sequence for the nested 5' end primer was
5'-GCACACAGCATCCCTGGCAATATCTGG-3' (SEQ ID No. 9) and that of the
nested 3' end primer was 5'-CGGCCAACTACAGCCAAGTGGAGCTG-3' (SEQ ID
No. 10).
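
The relationship between each outer primer and its nested counterpart can be checked with a short sketch. It computes the longest stretch at the 3' end of an outer primer that is repeated at the 5' end of the nested primer, which shows how far the nested primer is shifted inward. The primer sequences are those recited above; the helper name is illustrative.

```python
def suffix_prefix_overlap(outer, nested):
    """Length of the longest suffix of `outer` that is also a prefix of `nested`."""
    for k in range(min(len(outer), len(nested)), 0, -1):
        if outer.endswith(nested[:k]):
            return k
    return 0


# Primer sequences as recited in the text (written 5' to 3').
PRIMERS = {
    "SEQ ID No. 7": "CCCTCTGGGTAGCCAGCACACAGCATC",    # 5' end primer
    "SEQ ID No. 8": "GCCATCGTGGTGCCGGCCAACTACAG",     # 3' end primer
    "SEQ ID No. 9": "GCACACAGCATCCCTGGCAATATCTGG",    # nested 5' end primer
    "SEQ ID No. 10": "CGGCCAACTACAGCCAAGTGGAGCTG",    # nested 3' end primer
}

overlap_5 = suffix_prefix_overlap(PRIMERS["SEQ ID No. 7"], PRIMERS["SEQ ID No. 9"])
overlap_3 = suffix_prefix_overlap(PRIMERS["SEQ ID No. 8"], PRIMERS["SEQ ID No. 10"])
print(overlap_5, overlap_3)  # 12 13
```

Each nested primer therefore begins within the last 12 or 13 nucleotides of its outer primer and extends further in the direction of amplification.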

The first set of RACE primers was used to amplify cDNA fragments from a
human small intestine Marathon-ready cDNA library (catalog number 7426-1;
www.clontech.com). Following this, nested RACE reactions were performed
using the nested primers. Several DNA bands were detected in all RACE
reactions. The pool of cDNA fragments larger than 500 bp was isolated by 2%
agarose electrophoresis and purified from the nested 5'- and 3'-RACE reactions
using the MinElute gel extraction kit (catalog number 28606; www.qiagen.com),
then subcloned into an E. coli vector (pCR2.1TOPO; catalog no. K-4500-01;
www.invitrogen.com) and transformed into E. coli TOP10 cells
(www.invitrogen.com). To identify clones that contained CVSP16 cDNA, colony
hybridization was performed using a ~300-bp cDNA probe amplified with the
nested 5'- and 3'-RACE primers from the same small intestine Marathon cDNA
library. Subsequent sequence analysis using a fluorescent dye-based DNA
sequencing method (catalog number 4390244; ABI PRISM® BigDye™ Terminator
v 3.0 Ready Reaction Cycle Sequencing Kits with AmpliTaq® DNA Polymerase, FS;
home.appliedbiosystems.com) confirmed that the nucleotide sequence of
these 5'- and 3'-RACE-derived clones matched that of the CVSP16 sequence.
A methionine start codon was missing from the
cDNA clone, but an in-frame stop codon could be found in the coding sequence;
thus, the 5'-RACE product did not extend to the beginning of the coding sequence.

PCR amplification of the full-length cDNA encoding CVSP16

A search of the Incyte proprietary database showed a clone (228456.2)
that had homology to the CVSP16 partial sequence. Based on this sequence,
several primer sets were designed and synthesized. To obtain the full-length
CVSP16 cDNA, 2 pairs of gene-specific primers that amplify 2 overlapping
portions of the CVSP16 sequence and the cDNA library made from human liver
were used. The first pair of primers used was:
5'-ATGGCCCGGCAGCTGCTCCTCCCCCTTGTG-3' (SEQ ID No. 11) for the 5' end
(putative start codon underlined) and
5'-CGGCTCCCGGGCAGGAAGTAGTGTTCCG-3' (SEQ ID No. 12) for the 3' end.
This primer pair amplified the initial half of the CVSP16 sequence. The second
pair of primers used was: 5'-TGGGTCTTGGCACCTGCCAGCTGCTTTCTG-3' (SEQ
ID No. 15) at the 5' end and
5'-GAAGGGGGAAGTGGTGCTGGGACCCTAG-3' (SEQ ID No. 16) for the 3' end.
This pair amplified the last half of the CVSP16 sequence, and the 3'-end primer
corresponds to the sequence downstream of the putative stop codon. Two cDNA
fragments (~1.3 and ~1.4 kbp) were amplified using these 2 sets of primers.
The PCR products were isolated by 2% agarose electrophoresis, purified using the
MinElute gel extraction kit (www.qiagen.com) and subcloned into pCR2.1TOPO
(www.invitrogen.com). Sequence analysis was performed to confirm the
nucleotide sequence.
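
A 3'-end (antisense) primer anneals to the coding strand through its reverse complement, so its binding site can be located by reverse-complementing the primer and searching the sense strand. The sketch below does this for the 3'-end primer of the first pair (SEQ ID No. 12). The sense-strand excerpt is transcribed from the sequence listing and is assumed here to span nucleotides 1291 to 1340; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def locate_antisense_primer(primer, sense, sense_start):
    """Return the 1-based (start, end) of the primer site on the sense strand,
    or None if the reverse complement of the primer is not found."""
    site = reverse_complement(primer)
    idx = sense.find(site)
    if idx < 0:
        return None
    start = sense_start + idx
    return start, start + len(site) - 1


SEQ_ID_12 = "CGGCTCCCGGGCAGGAAGTAGTGTTCCG"  # 3'-end primer of the first pair

# Sense-strand excerpt transcribed from the sequence listing
# (assumed to cover nucleotides 1291-1340 of the cDNA).
SENSE_EXCERPT = "CTACCCCACCCGGAACACTACTTCCTGCCCGGGAGCCGCTGCCGCCTGGC"
EXCERPT_START = 1291

site = locate_antisense_primer(SEQ_ID_12, SENSE_EXCERPT, EXCERPT_START)
print(reverse_complement(SEQ_ID_12), site)
```

Under these assumptions the primer site falls at nucleotides 1301 to 1328, which is consistent with the first primer pair amplifying a fragment of about 1.3 kbp from the start codon.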

The full-length coding region of CVSP16 was prepared by stitching the two
cDNA fragments using PCR. A ~2.3 kbp fragment was amplified, isolated by 1%
agarose electrophoresis, purified, subcloned and sequenced as described above.
The sequence obtained from this gene-specific amplification of CVSP16 matched
those sequences obtained from both 5'- and 3'-RACE reactions. In addition, the
missing 5' end containing a start codon was present.

Gene expression profile of CVSP16 in normal, tumor tissues and cell lines

To obtain information regarding the gene expression profile of the CVSP16
transcript, a ~370-bp CVSP16 cDNA fragment was used to probe a dot blot
composed of poly(A)+ RNAs extracted and purified from 76 different human tissues
(Human Multiple Tissue Expression (MTE) Array; catalog no. 7775-1;
www.clontech.com). The cDNA probe was amplified from human small intestine
Marathon cDNA library using the following primers: 5' end primer,
5'-CCCTCTGGGTAGCCAGCACACAGCATC-3' (SEQ ID No. 17), and 3' end primer,
5'-GCCATCGTGGTGCCGGCCAACTACAG-3' (SEQ ID No. 18). The results
indicate that the CVSP16 transcript is strongly expressed in several tissues
including kidney, stomach, colon, spleen, thyroid gland, trachea and pituitary
gland. The CVSP16 transcript also is found in many other tissues, albeit at a
lower level. Among tumor cell lines, the CVSP16 transcript is found (in
decreasing signal intensity) in cervical HeLa S3, lung A549, leukemia K-562,
Burkitt's lymphoma Raji, leukemia HL-60, colorectal SW480, Burkitt's lymphoma
Daudi and leukemia MOLT-4. To compare the expression profile of CVSP16
transcript in a range of normal human and matched tumor tissues, a matched
tumor/normal expression array (catalog number 7840-1; www.clontech.com)
composed of 68 paired cDNA samples from individual patients was used. Results
show that the CVSP16 transcript is expressed at a low level in a number of
normal tissues including breast, prostate, cervix, uterus, colon, lung, small
intestine, stomach, kidney and rectum, but is not differentially expressed in any of
the matched tumors.

Several SMART™ 5'-RACE cDNA libraries (catalog number K1811-1;
www.clontech.com) prepared from normal breast, normal testes, normal prostate,
prostate cancer cell lines and breast cancer cell lines were analyzed for the
presence of CVSP16 transcript by RT-PCR using gene-specific primers. The
primer sequences were:
5'-ATCGTGGTGCCGGCCAACTACAGCCAAGTG-3' (SEQ ID No. 19) for the 5' end
primer and 5'-ACCCATCACCTGCTCCCGTATCCATGCCTC-3' (SEQ ID No. 20) for
the 3' end primer. The CVSP16 transcript was detected (in decreasing signal
intensity) in normal breast, normal prostate, breast carcinoma cell line DU4475,
prostate carcinoma cell line PC-3, prostate carcinoma cell line LNCaP, breast
carcinoma cell line MDA-MB-231, and breast carcinoma cell line MDA-MB-453.

Structural features

The CVSP16 contains a signal peptide sequence (aa 1 to aa 23) and a
trypsin-like serine protease domain designated herein as CVSP16 PD1 (aa 46 to aa
286) characterized by the presence of a protease activation cleavage site
(...R46↓I47VGGSNAQP..., where ↓ indicates the protease activation cleavage site) at
the beginning of the domain and the catalytic triad residues (H87, D139 and S243) in
3 highly-conserved regions of the catalytic domain. CVSP16 also has an
additional 465-amino acid sequence (aa 287 to aa 752) beyond the protease
domain. Analysis of this 465-amino acid long region indicates the presence of a
second protease domain (aa 323 to aa 550, designated herein as CVSP16 PD2).
In this domain, however, the invariant catalytic histidine is replaced by a serine
(S363) residue, and the highly conserved SGGP sequence that contains the
catalytic serine has been replaced with the sequence S510RWS. The starting
residue is unusual, suggesting that cleavage may not be needed for activation.
These sequences and differences from other protease domains indicate that the
second protease domain has lower catalytic activity.
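
The residue numbering used above can be cross-checked against the translated sequence with a small lookup sketch. The fragments below are transcribed, with obvious OCR damage corrected, from the translated sequence listing and are keyed by the residue number of their first amino acid; the helper name is illustrative.

```python
# Fragments of the translated CVSP16 sequence from the listing, keyed by the
# residue number of the first amino acid in each fragment.
FRAGMENTS = {
    41: "PEPSARIVGGSNAQPGTWPW",
    81: "WVLSAAHCFMTNGTLEPAAE",
    121: "AVAAIVVPANYSQVELGADL",
    241: "GDSGGPLVCEEGGRWFQAGI",
}


def residue_at(position):
    """Return the one-letter residue at a 1-based position, if covered."""
    for start, fragment in FRAGMENTS.items():
        if start <= position < start + len(fragment):
            return fragment[position - start]
    raise KeyError(f"position {position} is not covered by the fragments")


# Activation cleavage site (between R46 and I47) and the PD1 catalytic triad.
for label, pos in (("P1", 46), ("P1'", 47), ("His", 87), ("Asp", 139), ("Ser", 243)):
    print(label, pos, residue_at(pos))
```

The lookup returns R and I at positions 46 and 47, and H, D and S at positions 87, 139 and 243, in agreement with the cleavage site and catalytic triad described for PD1.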

CVSP16 has 8 putative N-linked glycosylation sites (...N92GT...,
...N130YS..., ...N217LT..., ...N317CT..., ...N369SS..., ...N402AS..., ...N421LS...,
...N508DS...). The following cysteine pairings are noted: C72-C88, C173-C249,
C206-C228, C239-C267, C348-C364, C444-C516, C472-C494 and C506-C534. In
addition, an unpaired cysteine (C159) in the first protease domain should pair
with C38. An unpaired cysteine (C430) in the second protease domain should pair
with C325 or, less likely, C318. PD1 and PD2 have an additional Cys (C208 and
C474, respectively) that is unpaired or pairs with a Cys outside of each protease
domain. The protein has a C-terminal domain beyond residue 550.
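
Putative N-linked glycosylation sites are conventionally recognized by the sequon N-X-S/T, where X is any residue except proline. The sketch below, offered as an illustration rather than as the method actually used, scans three fragments transcribed from the translated sequence listing and reports each sequon with its residue number; the fragment starting positions and the function name are assumptions made for the example.

```python
import re

# N-X-S/T sequon, X != P; the lookahead allows overlapping matches.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(fragment, start):
    """Return (residue_number, tripeptide) for each sequon in the fragment,
    where `start` is the residue number of the fragment's first amino acid."""
    return [(start + m.start(), m.group(1)) for m in SEQUON.finditer(fragment)]


# Fragments transcribed from the translated sequence listing.
FRAGMENTS = {
    81: "WVLSAAHCFMTNGTLEPAAE",
    121: "AVAAIVVPANYSQVELGADL",
    501: "EEVGSCWNDSRWSLLCQEEG",
}

sites = []
for start, fragment in sorted(FRAGMENTS.items()):
    sites.extend(find_sequons(fragment, start))
print(sites)  # [(92, 'NGT'), (130, 'NYS'), (508, 'NDS')]
```

The three hits correspond to the N92GT, N130YS and N508DS sites listed above; running the same scan over the full 752-residue sequence would be needed to enumerate all eight.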

Homology of CVSP16 to other serine proteases

Clustal W alignment (using MacVector; version 6.5.3;
www.accelrys.com/products/macvector/index.html) of the derived CVSP16 full-length cDNA
and protein sequences with those of the cDNA and protein sequences derived
from the Incyte clone (228456.2) showed an 81% and 86% sequence identity,
respectively. Alignment (blastp; www.ncbi.nlm.nih.gov/BLAST) of the protease
domain (minus the 465-amino acid extension at the C terminus) sequence of
CVSP16 shows 40% identity with that of human matriptase/MTSP-1 (accession
number NP_068813), 40% identity with that of human prostamin (accession
number BAB20376.1), 40% identity with that of human marapsin (accession
number NP_114154.1), 39% identity with that of human enterokinase (accession
number NP_002763.1), 40% identity with that of human prostasin (accession
number NP_002764.1), 39% identity with that of human corin (accession number
NP_006587.1), 44% identity with that of human transmembrane tryptase
(accession number NP_036599.1), and 37% identity with that of human plasma
kallikrein (accession number NP_000883.1).
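
Percent identity figures such as those above are computed from a pairwise alignment. A minimal sketch of one common convention, identical columns divided by alignment length, follows. The two aligned strings are hypothetical toy inputs, not CVSP16 or matriptase alignments, and alignment programs differ in how they treat gaps, so the convention here is an assumption.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the full alignment length.

    Both inputs must be pre-aligned and of equal length; a column counts as
    identical only when both residues match and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned toy sequences (8 columns, 6 identical).
identity = percent_identity("IVGGSNAQ", "IVGGENAS")
print(identity)  # 75.0
```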

The CVSP16 and encoding nucleic acid have homology to clones
described in International PCT application No. WO 02/00860 (see SEQ ID No.
111 therein) and to clones in International PCT application No. WO 02/046383,
International PCT application No. WO 01/75067 and EP 1130094. The clones
and predicted encoded polypeptides described in the PCT applications and EP
application, however, differ from the nucleic acid molecule encoding CVSP16
polypeptides and the CVSP16 polypeptides provided herein. For example, each of
the nucleic acid molecules described in International PCT application
No. WO 02/00860 includes a sequence of nucleotides encoding the sequence of
amino acids set forth in SEQ ID No. 21 herein, and the disclosed polypeptides
include the sequence of amino acids set forth in SEQ ID No. 21 herein. None of the
polypeptides provided herein include at least 5, 10, 15, 20 or more contiguous
amino acids from SEQ ID No. 21, particularly between Gln660 and Met661 (SEQ
ID No. 6) or between the corresponding amino acids in other CVSP16s. Hence
the CVSP16s provided herein include the sequence Gln660Met661, particularly the
contiguous sequence Gly His Gln Met Thr Ser (see SEQ ID No. 6, amino acids
658-663), or Leu Pro Gln Gly His Gln Met Thr Ser Ala (see SEQ ID No. 6, amino
acids 655-664).

Sequence analysis

CVSP16 cDNA and protein sequences were analyzed using MacVector
nucleic acid/protein sequence analysis program. The full length cDNA encoding
CVSP16 is 2,293 bp long containing a 2,259-bp open reading frame, which
translates to a 752-amino acid protein. The cDNA encoding the active protease
domain is 717 bp long which translates to a 239-amino acid domain. The G + C
content of the CVSP16 cDNA is 68%. Attached are the cDNA sequence and the
translated protein sequence of CVSP16 (see, also, SEQ ID Nos. 5 and 6).
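
The lengths quoted above are related by simple codon arithmetic, sketched here for illustration: an open reading frame of 2,259 bp is 753 codons, of which the last is the stop codon, giving 752 amino acids, while a 717-bp domain-encoding segment with no stop codon gives 239 amino acids. A G + C fraction helper is included as well; the function names are illustrative, and the 68% figure is taken from the text rather than recomputed here.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def gc_fraction(dna):
    """Fraction of G and C bases in a DNA string (case-insensitive)."""
    dna = dna.upper()
    if not dna:
        raise ValueError("sequence is empty")
    return (dna.count("G") + dna.count("C")) / len(dna)


full_length_aa = protein_length(2259)                   # ORF including stop codon
domain_aa = protein_length(717, includes_stop=False)    # protease domain segment
noncoding_bp = 2293 - 2259                              # bases outside the ORF
print(full_length_aa, domain_aa, noncoding_bp)  # 752 239 34
```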
CVSP16 full length cDNA and translated protein sequence
Sequence Range: 1 to 2293

10 20 30 40 50 60 

ATGGC CCGGCAGCTGCTC CTCC CC CTTGTGGTGCTTGTCATCAGTCC CATC CCAGGAGC C 
TACCGGGCCGTCGACGAGGAGGGGGAACAC CACGAACAGTAGTCAGGGTAGGGTC CTCGG 
MARQIiLLPLVVLVI SPIPGA> 

10 70 80 90 100 110 120 

TTCCAGGACrCAGCrCTCAGTCCTACCCAGGAAGAACCTGAAGATCTGGACTGCGGGCGC 
AAGGTCCTGAGTCGAGAGTCAGGATGGGTC CTTCTTGGACTTCTAGAC CTGACGCC CGCG 
FQDSALSPTQEEPEDLDCG R> 

130 140 150 160 170 180 

1 5 CCTGAGCCCTCGGCCCGCATCGTGGGGGGCTCAAACGCGCAGCCGGGCACCTGGCCTTGG 

GGACTCGGGAGCCGGGCGTAGCACCCCCCGAGTTTGCGCCTCGGCCCGTGGACCGGAACC 
PEPSARIVGGSNAQPGTWPW> 

190 200 210 220 230 240 

CAAGTGAGCCTGCACCATGGAGGTGGCCACATCTGCGGGGGCTCCCTCATCGCCCCCTCC 
20 GTTCACTCGGACGTGGTAC CTC CACCGGTGTAGACGC C CCCGAGGGAGTAGCGGGGGAGG 

QVSLHHGGGHICGGS3jIAPS> 

250 260 270 280 290 300 

TGGGTCCTCTCCGCCGCTCUICTGTTTCATGACGAATGGGACGCTGGAGCCCGCGGCCGAG 
AC CCAGGAGAGGCGGCGAGTGACAAAGTACTGCTTAC CCTGCGAC ctcgggcgc CGGCTC 
25 WVLSAAHCFMTNGTLEPAA E> 

310 320 330 340 350 360 

TGGTCGGTACTGCTGGGCGTGCACTCCCAGGACGGGCCCCTGGACGGCGCGCACACCCGC 
AC CAGC CATGACGAC CCGCACGTGAGGGTCCTGCC CGGGGACCTGC CGCGCGTGTGGGCG 
WSVLLGVHSQDGPLDGAHTR> 
30 370 380 390 400 410 420 

GCAGTGGCCGC CATCGTGGTGCCGGC CAACTACAGCCAAGTGGAGCTGGGCGCCGACCTG 
CGTCACCGGCGGTAGGACCACGGCCGGTTGATGTCGGTTCACCTCGACCCGCGGCTGGAC 
AVAAI VVPANY SQVELGAD L> 
430 440 450 460 470 480 

3 5 GC CCTGCTGCGC CTGGCCTCACCCGC CAGCCTGGGC CC CGCCGTGTGGC CTGTCTGCCTG 

CGGGACGACGCGGACCGGAGTGGGCGGTCGGACCCGGGGCGGCACACCGGACAGACGGAC 
ALLRLASPASLGPAVWPVCL> 
490 500 510 520 530 540 

CCCCGCGCCTCACACCGCTTCGTGCACGGC^CCGCCTGCTKX^ 
40 GGGGCGCGGAGTGTGGCGAAGCACGTGCCGTGGCGGACGACCCGGTGGCCGACCCCTCTG 

PRASHRFVHGTACWATGWG D> 

550 560 570 580 590 600 

GTCCAGGAGGCAGATCCTCTGCCTCTCCCCTGGGTGCTACAGGAAGTGGAGCTAAGGCTG 
CAGGTCCTCCGTCTAGGAGACGGAGAGGGGACCCACGATGTCCTTCACCTCGATTCCGAC 
45 VQEADPLPLPWVLQEVELR L> 

610 620 630 640 650 660 

CTGGGCGAGGCCACCTGTCAATGTCTCTACAGCCAGCCCGGTCCCTTCAACCTCACTCTC 
GACCCGCTCCGGTGGACAGTTACAGAGATGTCGGTCGGGCCAGGGAAGTTGGAGTGAGAG 
LGEATCQCLYSQPGP FNLT L> 
50 670 680 690 700 710 720 

CAGATATTGCC^GGGATGCTGTGTGCTGGCTACCCAGGGGGC 

GTCTATAACGGTCCCTACGACACACGACCGATGGGTCCCCCGGCGTC C CTGTGGACGGTC 
QI LPGMLCAGYPGGRRDTC Q> 
730 740 750 760 770 780 

5 5 GGTGACTCTGGGGGGCCCCTGGTCTGTGAGGAAGGCGGCCGCTGGTTCCAGGCAGGAATC 

CCACTGAGACCCCCCGGGGACCAGACACTCCTTCCGCCGGCGACCAAGGTCCGTCCTTAG 
GDSGGPLVCBEGGRWFQAGI> 

790 800 810 820 830 840 

AC CAGCTTTGGCTTTGGCTGTGGACGGAGAAACCGCC CTGGAGTTTTCACTGCTGTGG CT 
60 TCGTCGAAACCGAAACCGACACCTGCCTCTTTGGCGGGACCTCAAAAGTGA 

TS FGFGCGRRNRPGVFTAV A> 

850 860 870 880 . 890 900 

AC CT ATGAGG CATGG ATAC GGGAGCAGGTGATGGGTT CAGAG C CTGGGC CTG C CTTT C C C 
TGGATACTCCGTAC CTATGC CCTCGTC GACTACCCAAGTCTCGGACCCGGACGGAAAGGG
TYEAWIREQVMGSB P G P A F P> 
910 920 930 940 950 960 

AC CCAGC CC CAGAAGACCCAGTC AGATCC C CAGGAGCCCAGGGAGGAGAACTGCAC CATT 
5 TGGGTCGGGGTCTTCTGGGTCAGTCTAGGGGTC CTCGGGTC CCTC CTCTTGACGTGGTAA 

TQPQ KTQSDPQEPREENCTI> 
970 980 990 1000 1010 1020 

GGCCTGCCTGAGTGCGGGAAGGCCCCGCGGCCAGGGGCCTGGCCCTGGGAGGCCCAGGTG 
CGGGACGGACTCACGCCCTTCCGGGGCGCCGGTCCCCGGACCGGGACCCTCCGGGTCCAC 
10 ALPECGKAPRPGAWPWEAQ V> 

1030 1040 1050 1060 1070 1080 

ATGGTGCCAGGATCCAGACCCTGCCATGGGGCGCTGGTGTCTGAAAGCTGGGTCTTGGCA 

TACCACGGTCCTAGGTCTGGGACGGTACC C CGCGACC ACAGACTTTCGAC C CAGAACCGT 
MVPGSRPCHGAliVSESWVLA> 

15 1090 1100 1110 1120 1130 1140 

CCTGCCAGCTGCTTTCTGGACCCGAACAGCTCCGACAGCCCACCCCGCGACCTCGACGCC 

GGACGGTCGACGAAAGACCTGGGCTTGTCGAGGCTGTCGGGTGGGGCGCTGGAGCTGCGG 
PASCFLDPNSSDSPPRDLDA> 

1150 1160 1170 1180 1190 1200 

20 TGGCGCGTGCTGCTGCCCTCGCACCCGCGCGCGGAGCGGGTGGCGCGCCTGGTGCAGCAC 

ACCGCGCACGACGACGGGAGCGTGGGCGCGCGCCTCGCCCACCGCGCGGACCACGTCGTG 
WRVLLPSHPRAERVARLVQ H> 

1210 1220 1230 1240 1250 1260 

GAGAACGCTTCGTGGGACAACGCCCCGGACCTGGCGCTGCTGCAGCTGCGCACGCCCGTG 
2 5 CTCTTGCGAAGC^CCCTGTTGCGGGGCCTGGACCGCGACGACGTCGACGCGTGCGGGCAC 

ENASWDNAPDLALL QLRTP V> 
1270 1280 1290 1300 1310 1320 

AACCTGAGTGCGGCTTCGCGGCCCGTGTGCCTACCCCACCCGGAACACTACTTCCTGCCC 
TTGGACTCACGCCGAAGCGCCGGGCACACGGATGGGGTGGGCCTTGTGATGAAGGACGGG 
30 NLSAASRPVCLPHPEHYFL P> 

1330 1340 1350 1360 1370 1380 

GGGAGCCGCTGCCGCCTGGCCCGCTGGGGCCGCGGGGAACCCGCGCTTGGCCCAGGCGCG 
CCCTCGGCGACGGCGGACCGGGCGACCCCGGCGCCCCTTGGGCGCGAACCGGGTCCGCGC 
GSRCRLARWGRGEPALGPG A> 
35 1390 1400 1410 1420 1430 1440 

CTGCTGGAGGCGGAG CTGTTAGGCGG CTGGTGGTGCCACTGC CTGTACGGCCGC CAGGGG 
GACGACCT CCGC CTCGACAATCCGCCGACCACCACGGTGACGGACATG CCGG CGGTC CCC 
LLEAEIiLGGWWCHCLYGRQ G> 
1450 1460 1470 1480 1490 1500 

40 GCGGCAGTACCGCTGCCCGGAGACCCGCCGCACGCGCTCTGCCCTGCCTACCAG GAAAA G 

CGCCGTCATGGCGACGGGCCTCTGGGCGGCGTGCGCGAGACGGGACGGATGGTCCTTTTC 
AAVPLPGDPPHALC PAYQE K> 

1510 1520 1530 1540 1550 1560 

GAGGAGGTGGGCAGCTGCTGGAATGACTCGCGTTGGAGCCTTTTGTGCCAGGAGGAGGGG 
4 5 CTCCTCCACCCGTCGACGACCTTACTGAGCGCAACCTCGGAAAACACGGTCCTCCTCCCC 

EEVGSCWNDSRWSLLCQEE G> 

1570 1580 1590 1600 1610 1620 

ACCTGGTTTCTGGCTGGAATCAGAGACTTTCCCAGTGGCTGTCTACGTCCCCGAGCCTTC 
TGGACCAAAGACCGACCTTAGTCTCTGAAAGGGTCACCGACAGATGCAGGGGCTCGGAAG 
50 TWFLAGIRDFPSGCLRPRAF> 

1630 1640 1650 1660 1670 1680 

TTCCCTCTGCAGACTCATGGCCCATGGATCAGCCATGTGACTCGGGGAGCCTACCTGGAG
AAGGGAGACGTCTGAGTACCGGGTACCTAGTCGGTACACTGAGCCCCTCGGATGGACCTC
FPLQTHGPWISHVTRGAYLE>
1690 1700 1710 1720 1730 1740

GACCAGCTAGCCTGGGACTGGGGCCCTGATGGGGAGGAGACTGAGACACAGACTTGTCCC
CTGGTCGATCGGACCCTGACCCCGGGACTACCCCTCCTCTGACTCTGTGTCTGAACAGGG
DQLAWDWGPDGEETETQTCP>
1750 1760 1770 1780 1790 1800 

CCACACACAGAGCATGGTGCCTGTGGCCTGCGGCTGGAGGCTGCTCCAGTGGGGGTCCTG

GGTGTGTGTCTCGTACCACGGACACCGGACGCCGACCTCCGACGAGGTCACCCCCAGGAC
PHTEHGACGLRLEAAPVGVL>

1810 1820 1830 1840 1850 1860 

TGGCCCTGGCTGGCAGAGGTGCATGTGGCTGGTGATCGAGTCTGCACTGGGATCCTCCTG 
ACCGGGACCGACCGTCTCCACGTACACCGACCACTAGCTCAGACGTGACCCTAGGAGGAC

WPWLAEVHVAGDRVCTGILL>

1870 1880 1890 1900 1910 1920 

GCCCCAGGCTGGGTCCTGGCAGCCACTCACTGTGTCCTCAGGCCAGGCTCTACAACAGTG
CGGGGTCCGACCCAGGACCGTCGGTGAGTGACACAGGAGTCCGGTCCGAGATGTTGTCAC
APGWVLAATHCVLRPGSTTV>
1930 1940 1950 1960 1970 1980 

CCTTACATTGAAGTGTATCTGGGCCGGGCAGGGGCCAGCTCCCTCCCACAGGGCCACCAG
GGAATGTAACTTCACATAGACCCGGCCCGTCCCCGGTCGAGGGAGGGTGTCCCGGTGGTC
PYIEVYLGRAGASSLPQGHQ>

1990 2000 2010 2020 2030 2040 

ATGACCTCAGCACCGCCCCTCCTGTGCCAGATGACGGAAGGGTCCTGGATCCTCGTGGGC
TACTGGAGTCGTGGCGGGGAGGACACGGTCTACTGCCTTCCCAGGACCTAGGAGCACCCG
MTSAPPLLCQMTEGSWILVG>
2050 2060 2070 2080 2090 2100

ATGGCTGTTCAAGGGAGCCGGGAGCTGTTTGCTGCCATTGGTCCTGAAGAGGCCTGGATC
TACCGACAAGTTCCCTCGGCCCTCGACAAACGACGGTAACCAGGACTTCTCCGGACCTAG
MAVQGSRELFAAIGPEEAWI>
2110 2120 2130 2140 2150 2160 

TCCCAGACAGTGGGAGAGGCCAACTTCCTGCCCCCCAGTGGCTCCCCACACTGGCCCACT

AGGGTCTGTCACCCTCTCCGGTTGAAGGACGGGGGGTCACCGAGGGGTGTGACCGGGTGA
SQTVGEANFLPPSGSPHWPT>
2170 2180 2190 2200 2210 2220 

GGAGGCAGCAATCTCTGCCCCCCAGAACTGGCCAAGGCCTCGGGATCCCCGCATGCAGTC 
CCTCCGTCGTTAGAGACGGGGGGTCTTGACCGGTTCCGGAGCCCTAGGGGCGTACGTCAG

GGSNLCPPELAKASGSPHAV>
2230 2240 2250 2260 2270 2280 

TACTTCCTGCTCCTGCTGACTCTCCTGATCCAGAGCTGAGGGGCTAGGGTCCCAGCACCA 
ATGAAGGACGAGGACGACTGAGAGGACTAGGTCTCGACTCCCCGATCCCAGGGTCGTGGT 
YFLLLLTLLIQS*>

2290 
CTTCCCCCTTCTC 
GAAGGGGGAAGAG 
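
The three-line layout of the listing above (coding strand, complementary strand, encoded amino acids) can be checked mechanically. The following is a minimal sketch in Python, assuming the standard genetic code; the helper names `complement` and `translate` are illustrative only and are not part of the disclosure. It is applied to the 60-nucleotide block that encodes MVPGSRPCHGALVSESWVLA and to the final coding nucleotides of the listing, which end in a stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the listing uses the standard code, as the printed translation suggests).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def complement(dna: str) -> str:
    """Return the base-by-base complement, as printed under each coding line."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


def translate(dna: str) -> str:
    """Translate a coding strand in frame 1; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Coding strand of the block that is printed with MVPGSRPCHGALVSESWVLA.
BLOCK = "ATGGTGCCAGGATCCAGACCCTGCCATGGGGCGCTGGTGTCTGAAAGCTGGGTCTTGGCA"
# Last coding nucleotides of the listing, ending with the TGA stop codon.
TAIL = "TACTTCCTGCTCCTGCTGACTCTCCTGATCCAGAGCTGA"

print(complement(BLOCK))
print(translate(BLOCK))
print(translate(TAIL))
```

Running the same two helpers over any other block of the listing gives a quick way to spot transcription errors in either strand or in the amino acid line.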

CVSP16 polypeptide sequence
Sequence Range: 1 to 753 



10 20 30 40 50 60 

MARQLLLPL WLVI S P I PGAFQDSAL S PTQEE PEDLDCGRPE P SARI VGG SNAQPGTWPW 
70 80 90 100 110 120

QVS LHHGGGH I CGGS L IAP S WVLSAAH C FMTNGTLE PAAEWS VLLGVH S QDGPIjDGAHTR 
130 140 150 160 170 180 

AVAAIVVPANYSQVELGADLJUjLRIjASPASL 

190 200 210 220 230 240 

VQEADPLPIiPWVLQEVELRLLGEATCQC^YSQPGPFNIjTLQIIjPGMIiCAGYPGGRRI)TCQ

250 260 270 280 290 300 

GDSGGPLVCEEGGRWFQAGITSPGFGCGPJUORPGVFTAVATYEAWIREQVMGSEPGPAFP 
310 320 330 340 350 360 

TQPQKTQSDPQEPREE^CTIALPECGKAPRPGAWPWEAQVMVPGSRPCHGALVSESWVIjA 
370 380 390 400 410 420

PASCFLDPNSSDSPPRDLDAWRVLLPSHPRAERVARLVQHENASWDNAPDLALLQLRTPV
430 440 450 460 470 480

NLSAASRPVCLPHPEHYFLPGSRCRLARWGRGEPALGPGALLEAELLGGWWCHCLYGRQG
490 500 510 520 530 540

AAVPLPGDPPHALCPAYQEKEEVGSCWNDSRWSLLCQEEGTWFLAGIRDFPSGCLRPRAF

550 560 570 580 590 600

FPLQTHGPWISHVTRGAYLEDQLAWDWGPDGEETETQTCPPHTEHGACGLRLEAAPVGVL
610 620 630 640 650 660

WPWLAEVHVAGDRVCTGILLAPGWVLAATHCVLRPGSTTVPYIEVYLGRAGASSLPQGHQ
670 680 690 700 710 720

MTSAPPLLCQMTEGSWILVGMAVQGSRELFAAIGPEEAWISQTVGEANFLPPSGSPHWPT

730 740 750

GGSNLCPPELAKASGSPHAVYFLLLLTLLIQS*

EXAMPLE 2

Expression of the protease domains 

Nucleic acid encoding a full-length CVSP16 and/or a protease domain
thereof can be cloned into a derivative of the Pichia pastoris vector pPIC9K
(available from Invitrogen; see SEQ ID NO. 13) or pPIC9KX (described below),
which is introduced into a suitable Pichia host or other compatible host and used
to express the encoded CVSP16 or portion thereof.

Plasmid pPIC9K features include the 5' AOX1 promoter fragment at
1-948; 5' AOX1 primer site at 855-875; alpha-factor secretion signal(s) at
949-1218; alpha-factor primer site at 1152-1172; multiple cloning site at
1192-1241; 3' AOX1 primer site at 1327-1347; 3' AOX1 transcription
termination region at 1253-1586; HIS4 ORF at 4514-1980; kanamycin
resistance gene at 5743-4928; 3' AOX1 fragment at 6122-6879; ColE1 origin
at 7961-7288; and the ampicillin resistance gene at 8966-8106. The plasmid
pPIC9KX is derived from pPIC9K by eliminating the XhoI site in the kanamycin
resistance gene.

Other vectors that can be used for expression of CVSP16 or portions
thereof include, but are not limited to, insect and mammalian vectors as
described, for example, above. The protein also can be expressed in E. coli, for
example, as inclusion bodies in the cytoplasm or in the cytoplasm using the
strain Origami (i.e., Origami B from Novagen, Madison WI) that permits folding in
the cytoplasm. CVSP16 also can be expressed in the periplasmic space.

EXAMPLE 3 

Assays for identification of candidate compounds that modulate the activity of a
serine protease

Assay for identifying inhibitors 

The ability of test compounds to act as inhibitors of the catalytic activity
of a CVSP16 polypeptide can be assessed in an amidolytic
assay. Compound-mediated inhibition of amidolytic activity of a CVSP16
polypeptide, or a protease domain portion thereof, can be measured by IC50
values in such an assay.

An exemplary assay buffer is HBSA (10 mM Hepes, 150 mM sodium
chloride, pH 7.4, 0.1% bovine serum albumin). All reagents can be purchased
from Sigma Chemical Co. (St. Louis, MO), unless otherwise indicated. Two
IC50 assays are conducted: one at 30 minutes (a 30-minute preincubation of test
compound and enzyme) and one at 0 minutes (no preincubation of test compound
and enzyme). For the IC50 assay at 30 minutes, the following reagents are
combined in appropriate wells of a Corning microtiter plate: 50 microliters of
HBSA, 50 microliters of the test compound, diluted (covering a broad
concentration range) in HBSA (or HBSA alone for the uninhibited velocity
measurement), and 50 microliters of the SP or protease domain thereof diluted in
buffer, yielding a final enzyme concentration of about 0.5-5 nM. Following a
30-minute incubation at ambient temperature, the assay is initiated by the
addition of 50 microliters of a substrate for the particular SP (see, e.g., table and
discussion below), which is reconstituted in deionized water and diluted in
HBSA prior to the assay, yielding a final volume of 200 microliters and a final
substrate concentration of 200-600 µM.

For an IC50 assay at 0 minutes, the same reagents are combined: 50
microliters of HBSA, 50 microliters of the test compound, diluted (covering the
identical concentration range) in HBSA (or HBSA alone for uninhibited velocity
measurement), and 50 microliters of the substrate, such as a chromogenic
substrate. The assay is initiated by the addition of 50 microliters of SP. The
final concentrations of all components are identical in both IC50 assays (at 30-
and 0-minute incubations).
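
Because each of the four components is added as 50 microliters to a final volume of 200 microliters, every component is diluted four-fold in the well. The following minimal Python sketch illustrates that arithmetic; the function names are illustrative assumptions, and the example values are simply ends of the ranges stated above.

```python
def working_stock(final_conc: float, added_ul: float = 50.0,
                  final_ul: float = 200.0) -> float:
    """Concentration of the 50 µL addition needed to reach final_conc in the well."""
    return final_conc * final_ul / added_ul


def conc_during_preincubation(stock_conc: float, added_ul: float = 50.0,
                              mix_ul: float = 150.0) -> float:
    """Concentration while only HBSA, test compound and enzyme (150 µL) are present."""
    return stock_conc * added_ul / mix_ul


# Lower end of the stated enzyme range (0.5 nM final) and upper end of the
# stated substrate range (600 µM final).
enzyme_stock_nM = working_stock(0.5)
substrate_stock_uM = working_stock(600.0)
enzyme_preinc_nM = conc_during_preincubation(enzyme_stock_nM)

print(f"enzyme working stock: {enzyme_stock_nM} nM")
print(f"substrate working stock: {substrate_stock_uM} µM")
print(f"enzyme during preincubation: {enzyme_preinc_nM:.3f} nM")
```

The second helper makes explicit that, in the 30-minute format, enzyme and test compound are somewhat more concentrated during preincubation than after the substrate is added.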

The initial velocity of the substrate hydrolysis is measured in both assays
by, for example for a chromogenic substrate, the change in absorbance at a
particular wavelength, using a Thermo Max® Kinetic Microplate Reader
(Molecular Devices) over a 5 minute period, in which less than 5% of the added
substrate is hydrolyzed. The concentration of added inhibitor that causes a
50% decrease in the initial rate of hydrolysis is defined as the respective IC50
value in each of the two assays (30- and 0-minute).

Another assay for identifying inhibitors

Test compounds are assayed for inhibition of the protease activity of the
protease domain in Costar 96 well tissue culture plates (Corning NY).
Approximately 0.5-5 nM of the CVSP16 or protease domain thereof is mixed
with varying concentrations of inhibitor in 29.2 mM Tris, pH 8.4, 29.2 mM
imidazole, 217 mM NaCl (100 µL final volume) and allowed to incubate at room
temperature for 30 minutes. About 200-600 µM substrate is added, and the
reaction is monitored in a SpectraMAX® Plus microplate reader (Molecular Devices,
Sunnyvale CA) by following the change in a parameter correlated with
hydrolysis, such as absorbance for a chromogenic substrate, for 1 hour at 37 °C.

Alternative assay for screening CVSP16 

The protease domain of CVSP16 or full-length polypeptide or other
catalytically active portion thereof is expressed in Pichia pastoris. Test
compounds are screened for modulation of the activity of the CVSP16
polypeptide or portion thereof. Approximately 1-20 nM CVSP16 is mixed in
Costar 96 well tissue culture plates (Corning NY) with varying concentrations of
test compounds and/or known inhibitors or agonists in 29.2 mM Tris, pH 8.4,
29.2 mM imidazole, 217 mM NaCl (100 µL final volume), and allowed to
incubate at room temperature for 30 minutes. 200-600 µM of a chromogenic
substrate is added, and the reaction is monitored in a SpectraMAX Plus
microplate reader (Molecular Devices, Sunnyvale CA) by measuring the change
in absorbance at 405 nm for 30 minutes at 37 °C.

Identification of substrates

Particular substrates for use in the assays can be identified empirically by
testing substrates. The following substrates are exemplary of those that
can be tested.



Substrate name              Structure

S 2366                      pyroGlu-Pro-Arg-pNA.HCl
spectrozyme t-PA            CH3SO2-D-HHT-Gly-Arg-pNA.AcOH
N-p-tosyl-Gly-Pro-Arg-pNA   N-p-tosyl-Gly-Pro-Arg-pNA
Benzoyl-Val-Gly-Arg-pNA     Benzoyl-Val-Gly-Arg-pNA
Pefachrome t-PA             CH3SO2-D-HHT-Gly-Arg-pNA
S 2765                      N-α-Z-D-Arg-Gly-Arg-pNA.2HCl
S 2444                      pyroGlu-Gly-Arg-pNA.HCl
S 2288                      H-D-Ile-Pro-Arg-pNA.2HCl
spectrozyme UK              Cbo-L-(γ)Glu(α-t-BuO)-Gly-Arg-pNA.2AcOH
S 2302                      H-D-Pro-Phe-Arg-pNA.2HCl
S 2266                      H-D-Val-Leu-Arg-pNA.2HCl
S 2222                      Bz-Ile-Glu(γ-OR)-Gly-Arg-pNA.HCl
                            R = H (50%) and R = CH3 (50%)
Chromozyme PK               Benzoyl-Pro-Phe-Arg-pNA
S 2238                      H-D-Phe-Pip-Arg-pNA.2HCl
S 2251                      H-D-Val-Leu-Lys-pNA.2HCl
Spectrozyme PL              H-D-Nle-HHT-Lys-pNA.2AcOH
                            Pyr-Arg-Thr-Lys-Arg-AMC
                            H-Arg-Gln-Arg-Arg-AMC
                            Boc-Gln-Gly-Arg-AMC
                            Z-Arg-Arg-AMC
Spectrozyme THE             H-D-HHT-Ala-Arg-pNA.2AcOH
Spectrozyme fXIIa           H-D-CHT-Gly-Arg-pNA.2AcOH
                            CVS 2081-6 (MeSO2-dPhe-Pro-Arg-pNA)
                            Pefachrome fVIIa (CH3SO2-D-CHA-But-Arg-pNA)

pNA = para-nitroanilide (chromogenic)
AMC = amino methyl coumarin (fluorescent)

If none of the above substrates is cleaved, a coupled assay can be used.
Briefly, such assays test the ability of the protease to activate an enzyme, such
as plasminogen or trypsinogen. To perform these assays, the single chain
protease is incubated with a zymogen, such as plasminogen or trypsinogen, in
the presence of a known substrate for plasmin or trypsin, such as a
Spectrozyme substrate. If a single chain CVSP16 activates the zymogen, the
activated enzyme, such as plasmin or trypsin, will degrade the substrate
therefor.

EXAMPLE 4

Other Assays 

These assays are described with reference to MTSP1, but such assays 
can be readily adapted for use with CVSP16. 

Amidolytic Assay for Determining Inhibition of Serine Protease
Activity of Matriptase or MTSP1

The ability of test compounds to act as inhibitors of rMAP catalytic
activity was assessed by determining the inhibitor-induced inhibition of
amidolytic activity by the MAP, as measured by IC50 values. The assay buffer
was HBSA (10 mM Hepes, 150 mM sodium chloride, pH 7.4, 0.1% bovine serum
albumin). All reagents were from Sigma Chemical Co. (St. Louis, MO), unless
otherwise indicated.

Two IC50 assays were conducted: (a) one at either 30 minutes or 60 minutes
(a 30-minute or a 60-minute preincubation of test compound and enzyme) and
(b) one at 0 minutes (no preincubation of test compound and enzyme).
For the IC50 assay at either 30 minutes or 60 minutes, the following reagents
were combined in appropriate wells of a Corning microtiter plate: 50 microliters
of HBSA, 50 microliters of the test compound, diluted (covering a broad
concentration range) in HBSA (or HBSA alone for uninhibited velocity
measurement), and 50 microliters of the rMAP (Corvas International) diluted in
buffer, yielding a final enzyme concentration of 250 pM as determined by active
site titration. Following either a 30-minute or a 60-minute incubation at ambient
temperature, the assay was initiated by the addition of 50 microliters of the
substrate S-2765 (N-α-benzyloxycarbonyl-D-arginyl-L-glycyl-L-arginine-p-nitroaniline
dihydrochloride; DiaPharma Group, Inc.; Franklin, OH) to each well,
yielding a final assay volume of 200 microliters and a final substrate
concentration of 100 µM (about 4-times Km). Before addition to the assay
mixture, S-2765 was reconstituted in deionized water and diluted in HBSA. For
the IC50 assay at 0 minutes, the same reagents were combined: 50 microliters of
HBSA, 50 microliters of the test compound, diluted (covering the identical
concentration range) in HBSA (or HBSA alone for uninhibited velocity
measurement), and 50 microliters of the substrate S-2765. The assay was
initiated by the addition of 50 microliters of rMAP. The final concentrations of
all components were identical in both IC50 assays (at 30- or 60- and 0-minute).
The initial velocity of chromogenic substrate hydrolysis was measured in
both assays by the change of absorbance at 405 nm using a Thermo Max®
Kinetic Microplate Reader (Molecular Devices) over a 5 minute period, in which
less than 5% of the added substrate was used. The concentration of added
inhibitor that caused a 50% decrease in the initial rate of hydrolysis was
defined as the respective IC50 value in each of the two assays (30- or
60-minutes and 0-minute).
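
The IC50 definition above (the inhibitor concentration giving a 50% decrease in initial rate relative to the uninhibited velocity) can be sketched as follows. This is a minimal illustration, not the procedure used to generate any reported value: the text does not state how the 50% point was located, so log-linear interpolation between the two bracketing concentrations is an assumption, the function name is hypothetical, and the numbers in the example are invented purely for demonstration.

```python
import math


def ic50_by_interpolation(concentrations, velocities, v0):
    """Estimate IC50 by log-linear interpolation (assumed method).

    concentrations: positive inhibitor concentrations
    velocities:     initial rates measured at those concentrations
    v0:             uninhibited initial rate (HBSA in place of inhibitor)
    Returns None if the data never cross 50% of v0.
    """
    half = v0 / 2.0
    points = sorted(zip(concentrations, velocities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= half >= v_hi and v_lo > v_hi:
            fraction = (v_lo - half) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


# Hypothetical dilution series (nM) and initial rates (arbitrary units).
concs_nM = [1.0, 10.0, 100.0, 1000.0]
rates = [95.0, 75.0, 25.0, 5.0]
estimate = ic50_by_interpolation(concs_nM, rates, v0=100.0)
print(f"estimated IC50: {estimate:.1f} nM")
```

Applying the same function to the 0-minute and the 30- or 60-minute data sets gives the two respective IC50 values that the text compares.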

In vitro enzyme assays for specificity determination

The ability of compounds to act as a selective inhibitor of matriptase
activity was assessed by determining the concentration of test compound that
inhibits the activity of matriptase by 50% (IC50), as described in the above
Example, and comparing the IC50 value for matriptase to that determined for all or
some of the following serine proteases: thrombin, recombinant tissue
plasminogen activator (rt-PA), plasmin, activated protein C, chymotrypsin and
factor Xa.
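
One simple way to express the comparison described above is the ratio of each comparison protease's IC50 to the matriptase IC50, so that a larger ratio indicates greater selectivity for matriptase. The sketch below is illustrative only: the ratio as a summary statistic and the function name are assumptions, and every IC50 value shown is a made-up placeholder rather than a measured result.

```python
def selectivity_ratios(ic50_matriptase, ic50_others):
    """Return IC50(other protease) / IC50(matriptase) for each protease."""
    if ic50_matriptase <= 0:
        raise ValueError("matriptase IC50 must be positive")
    return {enzyme: value / ic50_matriptase
            for enzyme, value in ic50_others.items()}


# Placeholder IC50 values in nM, invented for illustration only.
example_others_nM = {
    "thrombin": 5000.0,
    "rt-PA": 20000.0,
    "plasmin": 1500.0,
    "activated protein C": 8000.0,
    "chymotrypsin": 30000.0,
    "factor Xa": 2500.0,
}
ratios = selectivity_ratios(10.0, example_others_nM)
for enzyme, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{enzyme}: {ratio:.0f}-fold")
```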

The buffer used for all assays was HBSA (10 mM HEPES, pH 7.5, 150
mM sodium chloride, 0.1% bovine serum albumin). The assay for IC50
determinations was conducted by combining in appropriate wells of a Corning
microtiter plate, 50 microliters of HBSA, 50 microliters of the test compound at
a specified concentration (covering a broad concentration range) diluted in HBSA
(or HBSA alone for V0 (uninhibited velocity) measurement), and 50 microliters of
the enzyme diluted in HBSA. Following a 30 minute incubation at ambient
temperature, 50 microliters of the substrate at the concentrations specified
below were added to the wells, yielding a final total volume of 200 microliters.
The initial velocity of chromogenic substrate hydrolysis was measured by the
change in absorbance at 405 nm using a Thermo Max® Kinetic Microplate Reader
over a 5 minute period in which less than 5% of the added substrate was used.
The concentration of added inhibitor which caused a 50% decrease in the initial
rate of hydrolysis was defined as the IC50 value.

Thrombin (fIIa) Assay

Enzyme activity was determined using the chromogenic substrate,
Pefachrome t-PA (CH3SO2-D-hexahydrotyrosine-glycyl-L-arginine-p-nitroaniline,
obtained from Pentapharm Ltd.). The substrate was reconstituted in deionized
water prior to use. Purified human α-thrombin was obtained from Enzyme
Research Laboratories, Inc. The buffer used for all assays was HBSA (10 mM
HEPES, pH 7.5, 150 mM sodium chloride, 0.1% bovine serum albumin).

IC50 determinations were conducted where HBSA (50 µL), α-thrombin (50
µL) (the final enzyme concentration is 0.5 nM) and inhibitor (50 µL) (covering a
broad concentration range) were combined in appropriate wells and incubated
for 0 or 30 minutes at room temperature prior to the addition of substrate
Pefachrome t-PA (50 µL) (the final substrate concentration is 250 µM, about 5
times Km). The initial velocity of Pefachrome t-PA hydrolysis was measured by
the change in absorbance at 405 nm using a Thermo Max® Kinetic Microplate
Reader over a 5 minute period in which less than 5% of the added substrate was
used. The concentration of added inhibitor which caused a 50% decrease in the
initial rate of hydrolysis was defined as the IC50 value.

Factor Xa

Factor Xa catalytic activity was determined using the chromogenic
substrate S-2765 (N-benzyloxycarbonyl-D-arginine-L-glycine-L-arginine-p-nitroaniline),
obtained from DiaPharma Group (Franklin, OH). All substrates were
reconstituted in deionized water prior to use. The final concentration of S-2765
was 250 µM (about 5-times Km). Purified human Factor X was obtained from
Enzyme Research Laboratories, Inc. (South Bend, IN) and Factor Xa (FXa) was
activated and prepared from it as described (Bock, P.E., Craig, P.A., Olson, S.T.,
and Singh, P. Arch. Biochem. Biophys. 273:375-388 (1989)). The enzyme was
diluted into HBSA prior to the assay where the final concentration was 0.25 nM.

Recombinant tissue plasminogen activator (rt-PA) Assay

rt-PA catalytic activity was determined using the substrate, Pefachrome
t-PA (CH3SO2-D-hexahydrotyrosine-glycyl-L-arginine-p-nitroaniline, obtained from
Pentapharm Ltd.). The substrate was made up in deionized water followed by
dilution in HBSA prior to the assay where the final concentration was 500
micromolar (about 3-times Km). Human rt-PA (Activase®) was obtained from
Genentech Inc. The enzyme was reconstituted in deionized water and diluted
into HBSA prior to the assay where the final concentration was 1.0 nM.

Plasmin Assay

Plasmin catalytic activity was determined using the chromogenic
substrate, S-2366 [L-pyroglutamyl-L-prolyl-L-arginine-p-nitroaniline
hydrochloride], which was obtained from DiaPharma Group. The substrate was
made up in deionized water followed by dilution in HBSA prior to the assay in
which the final concentration was 300 micromolar (about 2.5-times Km).
Purified human plasmin was obtained from Enzyme Research Laboratories, Inc.
The enzyme was diluted into HBSA prior to the assay where the final concentration
was 1.0 nM.

Activated Protein C (aPC) Assay

aPC catalytic activity was determined using the chromogenic substrate,
Pefachrome PC (delta-carbobenzyloxy-D-lysine-L-prolyl-L-arginine-p-nitroaniline
dihydrochloride, obtained from Pentapharm Ltd.). The substrate was made up
in deionized water followed by dilution in HBSA prior to the assay where the
final concentration was 400 micromolar (about 3-times Km). Purified human
aPC was obtained from Hematologic Technologies, Inc. The enzyme was diluted
into HBSA prior to the assay where the final concentration was 1.0 nM.

Chymotrypsin Assay

Chymotrypsin catalytic activity was determined using the chromogenic
substrate, S-2586 (methoxy-succinyl-L-arginine-L-prolyl-L-tyrosyl-p-nitroanilide),
which was obtained from DiaPharma Group. The substrate was made up in
deionized water followed by dilution in HBSA prior to the assay where the final
concentration was 100 micromolar (about 9-times Km). Purified (3X-crystallized;
CDI) bovine pancreatic alpha-chymotrypsin was obtained from Worthington
Biochemical Corp. The enzyme was reconstituted in deionized water and diluted
into HBSA prior to the assay where the final concentration was 0.5 nM.
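
Each assay above states a final substrate concentration together with an approximate multiple of Km. A minimal sketch of what those figures imply is given below. The concentrations and multiples are taken from the text; the implied Km values are back-calculated approximations, and the fraction of maximal velocity assumes simple Michaelis-Menten kinetics, which the text does not itself state. The names used are illustrative.

```python
# (final substrate concentration in µM, stated approximate multiple of Km)
ASSAYS = {
    "matriptase / S-2765": (100.0, 4.0),
    "thrombin / Pefachrome t-PA": (250.0, 5.0),
    "factor Xa / S-2765": (250.0, 5.0),
    "rt-PA / Pefachrome t-PA": (500.0, 3.0),
    "plasmin / S-2366": (300.0, 2.5),
    "aPC / Pefachrome PC": (400.0, 3.0),
    "chymotrypsin / S-2586": (100.0, 9.0),
}


def implied_km(substrate_uM: float, multiple: float) -> float:
    """Km implied by 'substrate is about <multiple> times Km'."""
    return substrate_uM / multiple


def fraction_of_vmax(multiple: float) -> float:
    """v/Vmax = [S]/(Km + [S]) with [S] = multiple * Km (Michaelis-Menten assumed)."""
    return multiple / (1.0 + multiple)


for name, (substrate_uM, multiple) in ASSAYS.items():
    km = implied_km(substrate_uM, multiple)
    print(f"{name}: Km about {km:.0f} µM, "
          f"v/Vmax about {fraction_of_vmax(multiple):.2f}")
```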

Since modifications will be apparent to those of skill in this art, it is
intended that this invention be limited only by the scope of the appended claims.

WHAT IS CLAIMED IS:

1. A substantially purified single chain or multi-chain polypeptide,
comprising at least two protease domains of a serine protease 16 (CVSP16),
wherein the polypeptide comprises at least 5 contiguous amino acids
corresponding to residues 508-544 of SEQ ID No. 6, or comprises the
contiguous sequence Asn Asp Ser or Trp Asn Asp or Ser Cys Trp Asn Asp Ser
or Cys Trp Asn Asp Ser or Gln Thr His or Leu Gln Thr His in the second protease
domain.

2. The polypeptide of claim 1, wherein one protease domain
comprises amino acids 323-550 or 326-550 of SEQ ID No. 6
or has at least about 60%, 70%, 80%, 90% or 95% sequence identity to
amino acids 326-550 of SEQ ID No. 6.

3. The polypeptide of claim 1, wherein one protease domain
comprises amino acids 46-286 of SEQ ID No. 6 or has at least about
60%, 70%, 80%, 90% or 95% sequence identity to amino acids 47-286 of SEQ
ID No. 6.

4. A substantially purified single chain or multi-chain polypeptide,
comprising a protease domain of a serine protease 16 (CVSP16) or a functionally
active portion thereof or a domain thereof, wherein:

if the polypeptide includes residues that correspond to Gln660 and Met661,
it does not include at least 5 contiguous amino acids from SEQ ID No. 21
inserted between residues that correspond to Gln660 and Met661 of SEQ ID No. 21.

5. A polypeptide of claim 1 or claim 4 that contains two or three
chains.

6. A polypeptide of claim 1 that has catalytic activity.

7. A polypeptide of claim 4 or claim 6 that comprises one protease 
domain. 

8. A polypeptide of claim 4 or claim 6 that comprises two protease 
domains. 

9. A polypeptide of claim 4 or claim 6, wherein a protease domain
comprises amino acids 46-286 or 326-550 of SEQ ID No. 6 or amino acids that
share at least about 60%, 70%, 80%, 90% or 95% homology to amino acids
46-286 or 326-550 of SEQ ID No. 6.

10. A polypeptide of claim 4 or claim 6, wherein the contiguous 
sequence is not present in the polypeptide at any locus. 
11. A polypeptide of claim 4 or claim 6, wherein the polypeptide
comprises the contiguous sequence Gly His Gln Met Thr Ser (SEQ ID No. 6,
amino acids 658-663).

12. A polypeptide of claim 8, that comprises amino acids 46-286 and
326-550 of SEQ ID No. 6 or amino acids that share at least about 60%, 70%,
80%, 90% or 95% sequence identity to each of amino acids 46-286 and
amino acids 326-550 of SEQ ID No. 6.

13. A polypeptide of claim 4 or claim 6, wherein the CVSP16 portion 
of the polypeptide consists essentially of amino acids 46-286 of SEQ ID No. 6. 

14. A polypeptide of claim 4 or claim 6, wherein the CVSP16 portion
consists essentially of amino acids 323-550 or 326-550 of SEQ ID No. 6.

15. A polypeptide of claim 13, wherein the CVSP16 portion of the
polypeptide has at least about 60%, 70%, 80%, 90% or 95% sequence
identity to amino acids 46-286 of SEQ ID No. 6.

16. A polypeptide of claim 14, wherein the CVSP16 portion of the
polypeptide has at least about 60%, 70%, 80%, 90% or 95% sequence
identity to amino acids 46-286 of SEQ ID No. 6.

17. A polypeptide of claim 4 or claim 6, wherein the CVSP16 portion 
of the polypeptide consists essentially of amino acids 46-550 of SEQ ID No. 6. 

18. The polypeptide of claim 4 or claim 6, wherein:

the CVSP16 portion of the polypeptide consists essentially of a
protease domain of a CVSP16 or a catalytically active portion thereof.

19. A polypeptide of claim 4 or claim 6, comprising the sequence of 
amino acids set forth in SEQ ID No. 6 or set forth as amino acids 24-752 in SEQ 
ID No. 6. 

20. A polypeptide of claim 4 or claim 6, consisting essentially of the
sequence of amino acids set forth in SEQ ID No. 6 or consisting essentially of
amino acids 24-752 of SEQ ID No. 6.

21. A polypeptide of claim 1 or claim 4 or claim 6, wherein the
CVSP16 is a human protein.

22. A polypeptide of claim 1 or claim 4, wherein the level of
expression and/or activity of the CVSP16 in tumor cells differs from its level of
expression and/or activity in non-tumor cells.

23. A polypeptide of claim 1 or claim 4, wherein the CVSP16 
polypeptide is detectable in a body fluid at a level that differs from its level in 
body fluids in a subject not having a tumor. 

24. A polypeptide of claim 1 or claim 4 or claim 6 that is a single
chain.

25. A polypeptide of claim 6 that is a two or three chain polypeptide. 

26. A polypeptide of claim 1 or claim 4, wherein:

the CVSP16 is present in a tumor; and

a substrate or cofactor for the CVSP16 is expressed at levels that differ
from its level of expression in a non-tumor cell in the same type of tissue.

27. A polypeptide of claim 1 or claim 4 that has at least about 60%, 
80%, 90% or 95% sequence identity with a polypeptide that comprises the 
sequence of amino acids set forth as SEQ ID No. 6 or a catalytically active 
portion thereof. 

28. A polypeptide of claim 1 or claim 4 that comprises a protease
domain encoded by a nucleic acid molecule selected from the group consisting
of:

a) a nucleic acid molecule that hybridizes under conditions of high
stringency along at least 70% of its full length to a nucleic acid molecule
comprising a sequence of nucleotides set forth in SEQ ID No. 5 that encodes
amino acids 46 to 285 or 326 to 550 of SEQ ID No. 6;

b) a nucleic acid molecule comprising the sequence of
nucleotides set forth in SEQ ID No. 5 that encodes residues 24-752; and

c) a nucleic acid molecule that comprises degenerate codons of a) or b).
29. A polypeptide of claim 4 that is selected from the group consisting
of:

a polypeptide encoded by the sequence of nucleotides set forth in
SEQ ID No. 5 or a catalytically active portion or ligand or substrate binding
portion of the polypeptide;

a polypeptide encoded by a sequence of nucleotides that
hybridizes under conditions of high stringency along 70% of its full length to the
sequence of nucleotides set forth in SEQ ID No. 5 or to a sequence of
nucleotides comprising degenerate codons thereof;

a polypeptide that comprises a sequence of amino acids having at
least about 85%, 86%, 88%, 90%, 93% or 95% sequence identity with the
sequence of amino acids set forth in SEQ ID No. 6; and

a polypeptide encoded by a splice variant of the sequence of
nucleotides set forth in SEQ ID No. 5.

30. A polypeptide that is a mutein of the polypeptide of claim 1, claim
4 or claim 6, wherein:

up to about 50% of the amino acids are replaced with another amino
acid; and

the resulting polypeptide is a single chain, two-chain or three-chain
polypeptide that has catalytic activity of at least 1% of the unmutated
polypeptide.

31. A polypeptide of claim 30, wherein up to about 25% of the amino
acids are replaced with another amino acid.

32. A polypeptide of claim 30, wherein up to about 10% of the amino 
acids are replaced with another amino acid. 

33. A polypeptide of claim 30, wherein the resulting polypeptide is a
single chain or two-chain or three-chain polypeptide and has catalytic activity of
at least 10% of the unmutated polypeptide.

34. A polypeptide of claim 32, wherein the resulting polypeptide is a 
single chain or two-chain or three-chain polypeptide and has catalytic activity of 
at least 10% of the unmutated polypeptide. 

35. A polypeptide of claim 30, wherein up to about 95% of the amino
acids are conserved or are replaced by conservative amino acid substitutions.

36. A polypeptide of claim 4 or claim 6, wherein an unpaired cysteine
in a protease domain is replaced with another amino acid.

37. The polypeptide of claim 36, wherein the replacing amino acid is a
serine.

38. A polypeptide of claim 36, wherein the unpaired Cys in a protease
domain is amino acid C159 and/or C^.

39. A nucleic acid molecule, comprising a sequence of nucleotides that 
encodes a polypeptide of claim 1 or claim 4 or claim 6 or claim 26. 

40. A plasmid or vector comprising the nucleic acid molecule of claim
39.

41. A vector of claim 40 that is an expression vector.

42. A vector of claim 41 that includes a sequence of nucleotides that 
directs secretion of any protein encoded by a sequence of nucleotides 
operatively linked thereto. 

43. A vector of claim 41 that is a Pichia vector, a baculovirus vector,
a mammalian cell vector or an E. coli vector.

44. A cell, comprising a plasmid or vector of claim 40. 

45. The cell of claim 44 that is a prokaryotic cell. 

46. The cell of claim 44 that is a eukaryotic cell.

47. The cell of claim 44 that is selected from among a bacterial cell, a
yeast cell, a plant cell, an insect cell and an animal cell.

48. The cell of claim 47 that is a mammalian cell. 

49. A method for producing a polypeptide that contains a protease
domain of a CVSP16, comprising:

culturing a cell of claim 44 under conditions whereby the encoded
protein is expressed by the cell; and

recovering the expressed protein.

50. The method of claim 49, wherein the cell is a Pichia cell and
the protein is optionally secreted into the culture medium or the cell is a
mammalian cell.

51. The method of claim 49, wherein the polypeptide is secreted into
the culture medium.

52. The method of claim 49, wherein the polypeptide is expressed in
the cytoplasm of the host cell.

53. The method of claim 49, wherein the polypeptide is expressed in
inclusion bodies, and the method further comprises

isolating the polypeptide from the inclusion bodies under conditions
whereby the polypeptide refolds into a proteolytically active form.

54. An antisense nucleic acid molecule that:

comprises at least 14 and less than about 150 contiguous nucleotides or
modified nucleotides that are complementary to a contiguous sequence of
nucleotides of a CVSP16 of claim 4;

comprises at least 16 and less than about 150 contiguous nucleotides or
modified nucleotides that are complementary to a contiguous sequence of
nucleotides of a CVSP16 of claim 4;

comprises at least 30 and less than about 150 contiguous nucleotides or
modified nucleotides that are complementary to a contiguous sequence of
nucleotides of a CVSP16 of claim 4,

wherein the contiguous nucleotides span nucleotides corresponding to
nucleotides 1978-1983 of SEQ ID No. 5.

55. A double-stranded RNA (dsRNA) molecule that comprises at least
about 21 contiguous nucleotides or modified nucleotides that are complementary
to all or a portion of a contiguous sequence of nucleotides that encodes the
sequence of amino acids set forth as SEQ ID No. 6.

56. The dsRNA of claim 55, wherein the contiguous nucleotides span 
nucleotides corresponding to nucleotides 1978-1983 of SEQ ID No. 5. 

57. An antibody that binds to a polypeptide of claim 4 with at least
10-fold greater affinity than to a polypeptide that includes the at least 5
contiguous amino acids set forth in SEQ ID No. 21.

58. An antibody that binds to a polypeptide of claim 4 with at least 2-fold
greater affinity than to a polypeptide that includes the at least 5 contiguous
amino acids set forth in SEQ ID No. 21.

59. The antibody of claim 58, wherein the contiguous sequence is
inserted between amino acids corresponding to Q660 and M661 of a CVSP16
polypeptide that comprises amino acids 24-752 of SEQ ID No. 6.

60. An antibody of claim 57 that binds with at least 100-fold greater 
affinity. 

61. An antibody of claim 58 that inhibits a catalytic activity of the 
polypeptide. 

62. An antibody of claim 58 that inhibits a ligand or substrate 
binding activity of the polypeptide. 

63. An antibody that specifically binds to a single-chain form of a 
protease domain 1 (PD1) of a CVSP16 polypeptide or to a single-chain form of a 
protease domain 2 (PD2) of a CVSP16 polypeptide. 

64. An antibody that specifically binds to a single-chain form of 
a CVSP16 polypeptide of claim 4. 

65. A conjugate, comprising: 

a) a CVSP16 polypeptide; and 

b) a targeting agent linked to the protein directly or via a linker. 

66. A combination, comprising: 

a) a modulator of the catalytic activity or substrate or ligand binding 
activity of a CVSP16 polypeptide; and 

b) another treatment agent or agent selected from anti-tumor and 
anti-angiogenic treatments or agents. 

67. The combination of claim 66, wherein the modulator is an 
inhibitor. 

68. The combination of claim 67, wherein the inhibited activity is 

catalytic activity. 

69. The combination of claim 68, wherein the modulator inhibitor and 
the anti-tumor and/or anti-angiogenic agent are formulated in a single 
pharmaceutical composition or each is formulated in separate pharmaceutical 

compositions. 

70. The combination of claim 69, wherein the inhibitor is selected from 
antibodies and antisense oligonucleotides. 

71. A solid support comprising two or more CVSP16 polypeptides of 
claim 4 linked thereto either directly or via a linker. 

72. The support of claim 71, wherein the polypeptides comprise an 

array. 

73. The support of claim 71, further comprising a plurality of different 

serine protease domains linked to the support directly or via a linker. 

74. A method for identifying compounds that modulate the protease 
activity of a CVSP16 polypeptide, comprising: 

contacting a polypeptide of claim 1 or claim 4 with a substrate 
proteolytically cleaved by the CVSP16 polypeptide, and, either simultaneously, 
before or after, adding a test compound or plurality thereof; 

measuring the amount of substrate cleaved in the presence of the test 
compound; and 

selecting compounds that change the amount cleaved compared to a 
control, whereby compounds that modulate an activity of the CVSP16 are 
identified. 

75. The method of claim 74, wherein the test compounds are small 
molecules, peptides, peptidomimetics, natural products, antibodies or fragments 
thereof. 

76. The method of claim 74, wherein a plurality of the test substances 

are screened simultaneously. 

77. The method of claim 74, wherein the change in the amount of 
substrate cleaved is assessed by comparing the amount cleaved in the presence 
of the test compound with the amount cleaved in the absence of the test 

compound. 

78. The method of claim 74, wherein the polypeptides comprise an 

array. 

79. The method of claim 74, wherein the polypeptides comprise a 
plurality of different serine proteases. 

80. A method of identifying a compound that specifically binds to a 
form of a CVSP16, comprising: 

contacting a CVSP16 polypeptide of claim 4, or a functionally 
active portion thereof, with a test compound or plurality thereof under 
conditions conducive to binding thereof, and either: 

a) identifying test compounds that specifically bind to a form, or to 
a functionally active portion thereof; or 

b) identifying test compounds that inhibit binding of a compound 
known to bind to a form of the polypeptide or to a functionally active portion 
thereof, wherein: 

the known compound is contacted with the polypeptide either before, 
simultaneously with, or after the test compound; 

a functionally active portion is a proteolytically active portion and/or a 
substrate or ligand binding portion; and 

a form is one or more of a single chain form, a two-chain form, a three 
chain form and/or a four chain form and the form is activated or is a zymogen or 
includes one or more activated domains. 

81. The method of claim 80, wherein the polypeptide is linked either 
directly, or indirectly, via a linker, to a solid support. 

82. The method of claim 80, wherein the test compounds are small 
molecules, peptides, peptidomimetics, natural products, antibodies or fragments 

thereof. 

83. The method of claim 80, wherein a plurality of the test substances 
are screened simultaneously. 

84. The method of claim 83, wherein a plurality of the polypeptides 
are linked to a solid support. 

85. A method for identifying activators of a zymogen form of a 
CVSP16 or functionally active portion thereof, comprising: 

contacting a zymogen form of a CVSP16 polypeptide of claim 1, 
or a functionally active portion thereof, with a substrate of the activated form of 
the polypeptide; 

adding a test compound, wherein the test compound is added 
before, after, or simultaneously with, the addition of the substrate; and 

detecting cleavage of the substrate, thereby identifying 
compounds that activate the zymogen, wherein: 

a functionally active portion is a proteolytically active portion and/or a 
substrate or ligand binding portion; and 

a zymogen is one or more of a single chain form, a two-chain form or a 
three chain form that includes at least one domain that is not activated. 

86. The method of claim 85, wherein the substrate is a chromogenic 
or fluorogenic substrate. 

87. The method of claim 85, wherein the test compounds are small 
molecules, peptides, peptidomimetics, natural products, antibodies or fragments 

thereof. 

88. A method for treating or preventing a neoplastic disease in a 
mammal, comprising administering to a mammal an effective amount of a 
modulator of a polypeptide of claim 4. 

89. The method of claim 88, wherein the modulator is an inhibitor. 

90. The method of claim 88, wherein the modulator is an antibody that 
specifically binds to the polypeptide, or a fragment or derivative of the antibody 
containing a binding domain thereof, wherein the antibody is a polyclonal 
antibody or a monoclonal antibody. 

91. A method of inhibiting tumor initiation, growth, progression, or 
treatment of a malignant or pre-malignant condition, comprising administering an 
agent that modulates activation cleavage of the zymogen form of a CVSP16 
polypeptide of claim 4 or a potentially functionally active portion thereof, or 
inhibits an activity of the activated form of CVSP16, or a proteolytically active 
portion thereof, wherein a functionally active portion is a proteolytically active 
portion and/or a substrate or ligand binding portion. 

92. The method of claim 91, wherein the agent inhibits cleavage. 

93. The method of claim 91, wherein the condition is a tumor or 
cancer of the uterus, breast, colon, lung, kidney, rectum, prostate, cervix, 

testes, stomach, esophagus, ovary, or small intestine, or is a leukemia or a 
lymphoma. 

94. The method of claim 91, wherein the agent is an antisense 
oligonucleotide, double-stranded RNA (dsRNA) or an antibody. 

95. The method of claim 91, further comprising administering another 
treatment or agent selected from anti-tumor and anti-angiogenic treatments or 

agents. 

96. A method of identifying a compound that binds to one or more 
forms of a CVSP16 polypeptide of claim 4, and/or to a functionally active portion 
thereof comprising: 

contacting a test compound with two or more forms of a CVSP16 
polypeptide of claim 4; 

and determining to which form or forms the compound binds; and 
if it binds to a form of a CVSP16 polypeptide, further 
determining whether the compound has at least one of the following 
properties: 

(i) inhibits activation cleavage of a zymogen form of polypeptide; 

(ii) inhibits activity of a form; and 

(iii) inhibits dimerization of the polypeptide, wherein: 

a functionally active portion is a catalytically active portion and/or a 
substrate or ligand binding portion; and 

a form is one or more of a single chain form, a two-chain form, a three 
chain form and/or a four chain form and the form is activated or is a zymogen or 
includes one or more activated domains. 

97. A method of detecting neoplastic disease, comprising: detecting 
a polypeptide that comprises a polypeptide of claim 4 in a biological sample, 

wherein the amount, form, and/or activity detected differs from the amount, 
form, and/or activity of the polypeptide detected from a subject without 
neoplastic disease. 

98. The method of claim 97, wherein the biological sample is selected 
from the group consisting of blood, urine, saliva, tears, synovial fluid, sweat, 

interstitial fluid, sperm, cerebrospinal fluid, ascites fluid, and/or tumor tissue 
biopsy and circulating tumor cells. 

99. The method of claim 96, wherein the biological sample is selected 
from the group consisting of blood, urine, saliva, tears, synovial fluid, sweat, 
interstitial fluid, cerebrospinal fluid, semen, ascites fluid, tumor tissue biopsy and 
circulating tumor cells. 

100. The method of claim 97, wherein the one or more forms of the 
CVSP16 polypeptide consist(s) essentially of a protease domain. 

101. A method of diagnosing the presence of a pre-malignant lesion, 
malignancy, or other pathologic condition in a subject, comprising: 

obtaining a biological sample from the subject; 
exposing it to an agent that binds to one or more forms of a CVSP16 
polypeptide or inhibits or potentiates an activity of the polypeptide; and 

detecting binding and/or a change in the activity, 
wherein: 

the pathological condition is characterized by the presence, excess 
or absence of a three-chain, two-chain and/or single-chain form; and 

detection of binding and/or a change in the activity is indicative of 
the pathological condition in the subject. 

102. The method of claim 101, wherein an activity is inhibited. 

103. The method of claim 101, wherein the agent is an antibody that 
specifically binds to a CVSP16 polypeptide. 

104. The method of claim 101, wherein the sample is bodily fluid 
selected from blood, urine, sweat, saliva, cerebrospinal fluid or synovial fluid. 

105. A method of monitoring tumor progression and/or therapeutic 
efficacy, comprising detecting and/or quantifying the level, form, and/or activity 

of a CVSP16 polypeptide in a bodily tissue or fluid sample. 

106. The method of claim 105, wherein the tumor is a tumor of the 
uterus, breast, colon, lung, kidney, rectum, prostate, cervix, testes, stomach, 
esophagus, ovary, or small intestine, or is a leukemia or a lymphoma. 

107. The method of claim 105, wherein the bodily fluid is blood, urine, 
sweat, saliva, cerebrospinal fluid or synovial fluid. 

108. A method of inhibiting tumor invasion or metastasis or treating a 
malignant or pre-malignant condition, comprising administering an agent that 
inhibits activation of the zymogen form of CVSP16 or an activity of an activated 
form. 

109. The method of claim 108, wherein the condition is a condition of 
the uterus, breast, colon, lung, kidney, rectum, prostate, cervix, testes, 

stomach, esophagus, ovary, or small intestine, or is a leukemia or a lymphoma. 

110. The method of claim 108, further comprising administering another 
treatment or agent selected from anti-tumor and anti-angiogenic treatments or 
agents. 

111. The method of claim 108, wherein the agent is an antisense 
oligonucleotide or an antibody. 

112. A signal sequence, consisting essentially of amino acids 1-23 of 
SEQ ID No. 6. 

113. A pro-polypeptide, comprising the signal sequence of claim 107, 
wherein the signal sequence is heterologous to a polypeptide operatively linked 

thereto. 

114. A polypeptide, comprising a portion of a CVSP16 polypeptide, 
wherein the portion consists essentially of amino acids 1-23 of SEQ ID No. 6. 

115. A computational method for screening compounds, comprising: 
assessing the interaction of a test compound with a computer-simulated 
polypeptide that has the sequence of amino acids of a polypeptide of 
any of claims 1, 4 and 6; and 

identifying test compounds that interact with the polypeptide, 
wherein assessment is effected in silico. 

116. A recombinant non-human animal, wherein an endogenous gene 
that encodes a polypeptide of claim 4 has been deleted or inactivated by 
homologous recombination or insertional mutagenesis of the animal or an 
ancestor thereof. 

117. A transgenic non-human, comprising heterologous nucleic acid 
that encodes a polypeptide of claim. 

118. The conjugate of claim 117, wherein the targeting agent permits 

i) affinity isolation or purification of the conjugate; 

ii) attachment of the conjugate to a surface; 

iii) detection of the conjugate; or 

iv) targeted delivery to a selected tissue or cell. 

SEQUENCE LISTING 

<110> Corvas International, Inc. 
Madison, Edwin 
Ong, Edgar 
Yeh, Juinn-Chern 

<120> NUCLEIC ACID MOLECULES ENCODING SERINE PROTEASE 16, THE 
ENCODED PROTEINS AND METHODS BASED THEREON 

<130> 24745-1625 

<140> Not Yet Assigned 
<141> Herewith 

<150> 60/394,347 
<151> 02-JUL-02 

<160> 22 

<170> FastSEQ for Windows Version 4.0 

<210> 1 

<211> 3147 

<212> DNA 

<213> Homo Sapien 

<220> 
<221> CDS 

<222> (23) . . . (2589) 

<223> Nucleotide sequence encoding MTSP1 
<300> 

<308> GenBank #AR081724 
<309> 2000-08-31 

<400> 1 

tcaagagcgg cctcggggta cc atg ggg agc gat cgg gcc cgc aag ggc gga  52
                         Met Gly Ser Asp Arg Ala Arg Lys Gly Gly
                         1               5                   10

ggg ggc ccg aag gac ttc ggc gcg gga ctc aag tac aac tcc cgg cac  100
Gly Gly Pro Lys Asp Phe Gly Ala Gly Leu Lys Tyr Asn Ser Arg His
                15                  20                  25

gag aaa gtg aat ggc ttg gag gaa ggc gtg gag ttc ctg cca gtc aac  148
Glu Lys Val Asn Gly Leu Glu Glu Gly Val Glu Phe Leu Pro Val Asn
            30                  35                  40

aac gtc aag aag gtg gaa aag cat ggc ccg ggg cgc tgg gtg gtg ctg  196
Asn Val Lys Lys Val Glu Lys His Gly Pro Gly Arg Trp Val Val Leu
        45                  50                  55

gca gcc gtg ctg atc ggc ctc ctc ttg gtc ttg ctg ggg atc ggc ttc  244
Ala Ala Val Leu Ile Gly Leu Leu Leu Val Leu Leu Gly Ile Gly Phe
    60                  65                  70

ctg gtg tgg cat ttg cag tac cgg gac gtg cgt gtc cag aag gtc ttc  292
Leu Val Trp His Leu Gln Tyr Arg Asp Val Arg Val Gln Lys Val Phe
75                  80                  85                  90

aat ggc tac atg agg atc aca aat gag aat ttt gtg gat gcc tac gag  340
Asn Gly Tyr Met Arg Ile Thr Asn Glu Asn Phe Val Asp Ala Tyr Glu
                95                  100                 105

aac tcc aac tcc act gag ttt gta agc ctg gcc agc aag gtg aag gac  388
Asn Ser Asn Ser Thr Glu Phe Val Ser Leu Ala Ser Lys Val Lys Asp
            110                 115                 120

gcg ctg aag ctg ctg tac agc gga gtc cca ttc ctg ggc ccc tac cac  436
Ala Leu Lys Leu Leu Tyr Ser Gly Val Pro Phe Leu Gly Pro Tyr His
        125                 130                 135

aag gag tcg gct gtg acg gcc ttc agc gag ggc agc gtc atc gcc tac  484
Lys Glu Ser Ala Val Thr Ala Phe Ser Glu Gly Ser Val Ile Ala Tyr
    140                 145                 150

tac tgg tct gag ttc agc atc ccg cag cac ctg gtg gag gag gcc gag  532
Tyr Trp Ser Glu Phe Ser Ile Pro Gln His Leu Val Glu Glu Ala Glu
155                 160                 165                 170

cgc gtc atg gcc gag gag cgc gta gtc atg ctg ccc ccg cgg gcg cgc  580
Arg Val Met Ala Glu Glu Arg Val Val Met Leu Pro Pro Arg Ala Arg
                175                 180                 185

tcc ctg aag tcc ttt gtg gtc acc tca gtg gtg gct ttc ccc acg gac  628
Ser Leu Lys Ser Phe Val Val Thr Ser Val Val Ala Phe Pro Thr Asp
            190                 195                 200

tcc aaa aca gta cag agg acc cag gac aac agc tgc agc ttt ggc ctg  676
Ser Lys Thr Val Gln Arg Thr Gln Asp Asn Ser Cys Ser Phe Gly Leu
        205                 210                 215

cac gcc cgc ggt gtg gag ctg atg cgc ttc acc acg ccc ggc ttc cct  724
His Ala Arg Gly Val Glu Leu Met Arg Phe Thr Thr Pro Gly Phe Pro
    220                 225                 230

gac agc ccc tac ccc gct cat gcc cgc tgc cag tgg gcc ctg cgg ggg  772
Asp Ser Pro Tyr Pro Ala His Ala Arg Cys Gln Trp Ala Leu Arg Gly
235                 240                 245                 250

gac gcc gac tca gtg ctg agc ctc acc ttc cgc agc ttt gac ctt gcg  820
Asp Ala Asp Ser Val Leu Ser Leu Thr Phe Arg Ser Phe Asp Leu Ala
                255                 260                 265

tcc tgc gac gag cgc ggc agc gac ctg gtg acg gtg tac aac acc ctg  868
Ser Cys Asp Glu Arg Gly Ser Asp Leu Val Thr Val Tyr Asn Thr Leu
            270                 275                 280

agc ccc atg gag ccc cac gcc ctg gtg cag ttg tgt ggc acc tac cct  916
Ser Pro Met Glu Pro His Ala Leu Val Gln Leu Cys Gly Thr Tyr Pro
        285                 290                 295

ccc tcc tac aac ctg acc ttc cac tcc tcc cag aac gtc ctg ctc atc  964
Pro Ser Tyr Asn Leu Thr Phe His Ser Ser Gln Asn Val Leu Leu Ile
    300                 305                 310

aca ctg ata acc aac act gag cgg cgg cat ccc ggc ttt gag gcc acc  1012
Thr Leu Ile Thr Asn Thr Glu Arg Arg His Pro Gly Phe Glu Ala Thr
315                 320                 325                 330

ttc ttc cag ctg cct agg atg agc agc tgt gga ggc cgc tta cgt aaa  1060
Phe Phe Gln Leu Pro Arg Met Ser Ser Cys Gly Gly Arg Leu Arg Lys
                335                 340                 345

gcc cag ggg aca ttc aac agc ccc tac tac cca ggc cac tac cca ccc  1108
Ala Gln Gly Thr Phe Asn Ser Pro Tyr Tyr Pro Gly His Tyr Pro Pro
            350                 355                 360

aac att gac tgc aca tgg aac att gag gtg ccc aac aac cag cat gtg  1156
Asn Ile Asp Cys Thr Trp Asn Ile Glu Val Pro Asn Asn Gln His Val
        365                 370                 375

aag gtg agc ttc aaa ttc ttc tac ctg ctg gag ccc ggc gtg cct gcg  1204
Lys Val Ser Phe Lys Phe Phe Tyr Leu Leu Glu Pro Gly Val Pro Ala
    380                 385                 390

ggc acc tgc ccc aag gac tac gtg gag atc aat ggg gag aaa tac tgc  1252
Gly Thr Cys Pro Lys Asp Tyr Val Glu Ile Asn Gly Glu Lys Tyr Cys
395                 400                 405                 410

gga gag agg tcc cag ttc gtc gtc acc agc aac agc aac aag atc aca  1300
Gly Glu Arg Ser Gln Phe Val Val Thr Ser Asn Ser Asn Lys Ile Thr
                415                 420                 425

gtt cgc ttc cac tca gat cag tcc tac acc gac acc ggc ttc tta gct  1348
Val Arg Phe His Ser Asp Gln Ser Tyr Thr Asp Thr Gly Phe Leu Ala
            430                 435                 440

gaa tac ctc tcc tac gac tcc agt gac cca tgc ccg ggg cag ttc acg  1396
Glu Tyr Leu Ser Tyr Asp Ser Ser Asp Pro Cys Pro Gly Gln Phe Thr
        445                 450                 455

tgc cgc acg ggg cgg tgt atc cgg aag gag ctg cgc tgt gat ggc tgg  1444
Cys Arg Thr Gly Arg Cys Ile Arg Lys Glu Leu Arg Cys Asp Gly Trp
    460                 465                 470

gcc gac tgc acc gac cac agc gat gag ctc aac tgc agt tgc gac gcc  1492
Ala Asp Cys Thr Asp His Ser Asp Glu Leu Asn Cys Ser Cys Asp Ala
475                 480                 485                 490

ggc cac cag ttc acg tgc aag aac aag ttc tgc aag ccc ctc ttc tgg  1540
Gly His Gln Phe Thr Cys Lys Asn Lys Phe Cys Lys Pro Leu Phe Trp
                495                 500                 505

gtc tgc gac agt gtg aac gac tgc gga gac aac agc gac gag cag ggg  1588
Val Cys Asp Ser Val Asn Asp Cys Gly Asp Asn Ser Asp Glu Gln Gly
            510                 515                 520

tgc agt tgt ccg gcc cag acc ttc agg tgt tcc aat ggg aag tgc ctc  1636
Cys Ser Cys Pro Ala Gln Thr Phe Arg Cys Ser Asn Gly Lys Cys Leu
        525                 530                 535

tcg aaa agc cag cag tgc aat ggg aag gac gac tgt ggg gac ggg tcc  1684
Ser Lys Ser Gln Gln Cys Asn Gly Lys Asp Asp Cys Gly Asp Gly Ser
    540                 545                 550

gac gag gcc tcc tgc ccc aag gtg aac gtc gtc act tgt acc aaa cac  1732
Asp Glu Ala Ser Cys Pro Lys Val Asn Val Val Thr Cys Thr Lys His
555                 560                 565                 570

acc tac cgc tgc ctc aat ggg ctc tgc ttg agc aag ggc aac cct gag  1780
Thr Tyr Arg Cys Leu Asn Gly Leu Cys Leu Ser Lys Gly Asn Pro Glu
                575                 580                 585

tgt gac ggg aag gag gac tgt agc gac ggc tca gat gag aag gac tgc  1828
Cys Asp Gly Lys Glu Asp Cys Ser Asp Gly Ser Asp Glu Lys Asp Cys
            590                 595                 600

gac tgt ggg ctg cgg tca ttc acg aga cag gct cgt gtt gtt ggg ggc  1876
Asp Cys Gly Leu Arg Ser Phe Thr Arg Gln Ala Arg Val Val Gly Gly
        605                 610                 615

acg gat gcg gat gag ggc gag tgg ccc tgg cag gta agc ctg cat gct  1924
Thr Asp Ala Asp Glu Gly Glu Trp Pro Trp Gln Val Ser Leu His Ala
    620                 625                 630

ctg ggc cag ggc cac atc tgc ggt gct tcc ctc atc tct ccc aac tgg  1972
Leu Gly Gln Gly His Ile Cys Gly Ala Ser Leu Ile Ser Pro Asn Trp
635                 640                 645                 650

ctg gtc tct gcc gca cac tgc tac atc gat gac aga gga ttc agg tac  2020
Leu Val Ser Ala Ala His Cys Tyr Ile Asp Asp Arg Gly Phe Arg Tyr
                655                 660                 665

tca gac ccc acg cag tgg acg gcc ttc ctg ggc ttg cac gac cag agc  2068
Ser Asp Pro Thr Gln Trp Thr Ala Phe Leu Gly Leu His Asp Gln Ser
            670                 675                 680

cag cgc agc gcc cct ggg gtg cag gag cgc agg ctc aag cgc atc atc  2116
Gln Arg Ser Ala Pro Gly Val Gln Glu Arg Arg Leu Lys Arg Ile Ile
        685                 690                 695

tcc cac ccc ttc ttc aat gac ttc acc ttc gac tat gac atc gcg ctg  2164
Ser His Pro Phe Phe Asn Asp Phe Thr Phe Asp Tyr Asp Ile Ala Leu
    700                 705                 710

ctg gag ctg gag aaa ccg gca gag tac agc tcc atg gtg cgg ccc atc  2212
Leu Glu Leu Glu Lys Pro Ala Glu Tyr Ser Ser Met Val Arg Pro Ile
715                 720                 725                 730

tgc ctg ccg gac gcc tcc cat gtc ttc cct gcc ggc aag gcc atc tgg  2260
Cys Leu Pro Asp Ala Ser His Val Phe Pro Ala Gly Lys Ala Ile Trp
                735                 740                 745

gtc acg ggc tgg gga cac acc cag tat gga ggc act ggc gcg ctg atc  2308
Val Thr Gly Trp Gly His Thr Gln Tyr Gly Gly Thr Gly Ala Leu Ile
            750                 755                 760

ctg caa aag ggt gag atc cgc gtc atc aac cag acc acc tgc gag aac  2356
Leu Gln Lys Gly Glu Ile Arg Val Ile Asn Gln Thr Thr Cys Glu Asn
        765                 770                 775

ctc ctg ccg cag cag atc acg ccg cgc atg atg tgc gtg ggc ttc ctc  2404
Leu Leu Pro Gln Gln Ile Thr Pro Arg Met Met Cys Val Gly Phe Leu
    780                 785                 790

agc ggc ggc gtg gac tcc tgc cag ggt gat tcc ggg gga ccc ctg tcc  2452
Ser Gly Gly Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Ser
795                 800                 805                 810



agc gtg gag gcg gat ggg cgg atc ttc cag gcc ggt gtg gtg agc tgg  2500
Ser Val Glu Ala Asp Gly Arg Ile Phe Gln Ala Gly Val Val Ser Trp
                815                 820                 825

gga gac ggc tgc gct cag agg aac aag cca ggc gtg tac aca agg ctc  2548
Gly Asp Gly Cys Ala Gln Arg Asn Lys Pro Gly Val Tyr Thr Arg Leu
            830                 835                 840

cct ctg ttt cgg gac tgg atc aaa gag aac act ggg gta ta ggggccgggg  2599
Pro Leu Phe Arg Asp Trp Ile Lys Glu Asn Thr Gly Val
        845                 850                 855

ccacccaaat gtgtacacct gcggggccac ccatcgtcca ccccagtgtg cacgcctgca 2659
ggctggagac tggaccgctg actgcaccag cgcccccaga acatacactg tgaactcaat 2719
ctccagggct ccaaatctgc ctagaaaacc tctcgcttcc tcagcctcca aagtggagct 2779
gggaggtaga aggggaggac actggtggtt ctactgaccc aactgggggc aaaggtttga 2839
agacacagcc tcccccgcca gccccaagct gggccgaggc gcgtttgtgt atatctgcct 2899
cccctgtctg taaggagcag cgggaacgga gcttcggagc ctcctcagtg aaggtggtgg 2959
ggctgccgga tctgggctgt ggggcccttg ggccacgctc ttgaggaagc ccaggctcgg 3019
aggaccctgg aaaacagacg ggtctgagac tgaaattgtt ttaccagctc ccagggtgga 3079
cttcagtgtg tgtatttgtg taaatgggta aaacaattta tttcttttta aaaaaaaaaa 3139
aaaaaaaa 3147

<210> 2 

<211> 855 

<212> PRT 

<213> Homo Sapien 



<400> 2 






























Met Gly Ser Asp Arg Ala Arg Lys Gly Gly Gly Gly Pro Lys Asp Phe
1               5                   10                  15

Gly Ala Gly Leu Lys Tyr Asn Ser Arg His Glu Lys Val Asn Gly Leu
            20                  25                  30

Glu Glu Gly Val Glu Phe Leu Pro Val Asn Asn Val Lys Lys Val Glu
        35                  40                  45

Lys His Gly Pro Gly Arg Trp Val Val Leu Ala Ala Val Leu Ile Gly
    50                  55                  60

Leu Leu Leu Val Leu Leu Gly Ile Gly Phe Leu Val Trp His Leu Gln
65                  70                  75                  80

Tyr Arg Asp Val Arg Val Gln Lys Val Phe Asn Gly Tyr Met Arg Ile
                85                  90                  95

Thr Asn Glu Asn Phe Val Asp Ala Tyr Glu Asn Ser Asn Ser Thr Glu
            100                 105                 110

Phe Val Ser Leu Ala Ser Lys Val Lys Asp Ala Leu Lys Leu Leu Tyr
        115                 120                 125

Ser Gly Val Pro Phe Leu Gly Pro Tyr His Lys Glu Ser Ala Val Thr
    130                 135                 140

Ala Phe Ser Glu Gly Ser Val Ile Ala Tyr Tyr Trp Ser Glu Phe Ser
145                 150                 155                 160

Ile Pro Gln His Leu Val Glu Glu Ala Glu Arg Val Met Ala Glu Glu
                165                 170                 175

Arg Val Val Met Leu Pro Pro Arg Ala Arg Ser Leu Lys Ser Phe Val
            180                 185                 190

Val Thr Ser Val Val Ala Phe Pro Thr Asp Ser Lys Thr Val Gln Arg
        195                 200                 205

Thr Gln Asp Asn Ser Cys Ser Phe Gly Leu His Ala Arg Gly Val Glu
    210                 215                 220

Leu Met Arg Phe Thr Thr Pro Gly Phe Pro Asp Ser Pro Tyr Pro Ala
225                 230                 235                 240

His Ala Arg Cys Gln Trp Ala Leu Arg Gly Asp Ala Asp Ser Val Leu
                245                 250                 255

Ser Leu Thr Phe Arg Ser Phe Asp Leu Ala Ser Cys Asp Glu Arg Gly
            260                 265                 270

Ser Asp Leu Val Thr Val Tyr Asn Thr Leu Ser Pro Met Glu Pro His
        275                 280                 285

Ala Leu Val Gln Leu Cys Gly Thr Tyr Pro Pro Ser Tyr Asn Leu Thr
    290                 295                 300

Phe His Ser Ser Gln Asn Val Leu Leu Ile Thr Leu Ile Thr Asn Thr
305                 310                 315                 320

Glu Arg Arg His Pro Gly Phe Glu Ala Thr Phe Phe Gln Leu Pro Arg
                325                 330                 335

Met Ser Ser Cys Gly Gly Arg Leu Arg Lys Ala Gln Gly Thr Phe Asn
            340                 345                 350

Ser Pro Tyr Tyr Pro Gly His Tyr Pro Pro Asn Ile Asp Cys Thr Trp
        355                 360                 365

Asn Ile Glu Val Pro Asn Asn Gln His Val Lys Val Ser Phe Lys Phe
    370                 375                 380

Phe Tyr Leu Leu Glu Pro Gly Val Pro Ala Gly Thr Cys Pro Lys Asp
385                 390                 395                 400

Tyr Val Glu Ile Asn Gly Glu Lys Tyr Cys Gly Glu Arg Ser Gln Phe
                405                 410                 415

Val Val Thr Ser Asn Ser Asn Lys Ile Thr Val Arg Phe His Ser Asp
            420                 425                 430

Gln Ser Tyr Thr Asp Thr Gly Phe Leu Ala Glu Tyr Leu Ser Tyr Asp
        435                 440                 445

Ser Ser Asp Pro Cys Pro Gly Gln Phe Thr Cys Arg Thr Gly Arg Cys
    450                 455                 460

Ile Arg Lys Glu Leu Arg Cys Asp Gly Trp Ala Asp Cys Thr Asp His
465                 470                 475                 480

Ser Asp Glu Leu Asn Cys Ser Cys Asp Ala Gly His Gln Phe Thr Cys
                485                 490                 495

Lys Asn Lys Phe Cys Lys Pro Leu Phe Trp Val Cys Asp Ser Val Asn
            500                 505                 510

Asp Cys Gly Asp Asn Ser Asp Glu Gln Gly Cys Ser Cys Pro Ala Gln
        515                 520                 525

Thr Phe Arg Cys Ser Asn Gly Lys Cys Leu Ser Lys Ser Gln Gln Cys
    530                 535                 540

Asn Gly Lys Asp Asp Cys Gly Asp Gly Ser Asp Glu Ala Ser Cys Pro
545                 550                 555                 560

Lys Val Asn Val Val Thr Cys Thr Lys His Thr Tyr Arg Cys Leu Asn
                565                 570                 575

Gly Leu Cys Leu Ser Lys Gly Asn Pro Glu Cys Asp Gly Lys Glu Asp
            580                 585                 590

Cys Ser Asp Gly Ser Asp Glu Lys Asp Cys Asp Cys Gly Leu Arg Ser
        595                 600                 605

Phe Thr Arg Gln Ala Arg Val Val Gly Gly Thr Asp Ala Asp Glu Gly
    610                 615                 620

Glu Trp Pro Trp Gln Val Ser Leu His Ala Leu Gly Gln Gly His Ile
625                 630                 635                 640

Cys Gly Ala Ser Leu Ile Ser Pro Asn Trp Leu Val Ser Ala Ala His
                645                 650                 655

Cys Tyr Ile Asp Asp Arg Gly Phe Arg Tyr Ser Asp Pro Thr Gln Trp
            660                 665                 670

Thr Ala Phe Leu Gly Leu His Asp Gln Ser Gln Arg Ser Ala Pro Gly
        675                 680                 685

Val Gln Glu Arg Arg Leu Lys Arg Ile Ile Ser His Pro Phe Phe Asn
    690                 695                 700

Asp Phe Thr Phe Asp Tyr Asp Ile Ala Leu Leu Glu Leu Glu Lys Pro
705                 710                 715                 720

Ala Glu Tyr Ser Ser Met Val Arg Pro Ile Cys Leu Pro Asp Ala Ser
                725                 730                 735

His Val Phe Pro Ala Gly Lys Ala Ile Trp Val Thr Gly Trp Gly His
            740                 745                 750

Thr Gln Tyr Gly Gly Thr Gly Ala Leu Ile Leu Gln Lys Gly Glu Ile
        755                 760                 765

Arg Val Ile Asn Gln Thr Thr Cys Glu Asn Leu Leu Pro Gln Gln Ile
    770                 775                 780

Thr Pro Arg Met Met Cys Val Gly Phe Leu Ser Gly Gly Val Asp Ser
785                 790                 795                 800

Cys Gln Gly Asp Ser Gly Gly Pro Leu Ser Ser Val Glu Ala Asp Gly
                805                 810                 815

Arg Ile Phe Gln Ala Gly Val Val Ser Trp Gly Asp Gly Cys Ala Gln
            820                 825                 830

Arg Asn Lys Pro Gly Val Tyr Thr Arg Leu Pro Leu Phe Arg Asp Trp
        835                 840                 845

Ile Lys Glu Asn Thr Gly Val
    850                 855














<210> 3 



















<211> 3147 

<212> DNA 

<213> Homo Sapien 

<220> 
<221> CDS 

<222> (1865) . . . (2590) 

<223> Nucleic acid sequence of protease domain of MTSP1 
<400> 3 

tcaagagcgg cctcggggta ccatggggag cgatcgggcc cgcaagggcg gagggggccc 60 
gaaggacttc ggcgcgggac tcaagtacaa ctcccggcac gagaaagtga atggcttgga 120 
ggaaggcgtg gagttcctgc cagtcaacaa cgtcaagaag gtggaaaagc atggcccggg 180 
gcgctgggtg gtgctggcag ccgtgctgat cggcctcctc ttggtcttgc tggggatcgg 240 
cttcctggtg tggcatttgc agtaccggga cgtgcgtgtc cagaaggtct tcaatggcta 300 
catgaggatc acaaatgaga attttgtgga tgcctacgag aactccaact ccactgagtt 360 
tgtaagcctg gccagcaagg tgaaggacgc gctgaagctg ctgtacagcg gagtcccatt 420 
cctgggcccc taccacaagg agtcggctgt gacggccttc agcgagggca gcgtcatcgc 480 
ctactactgg tctgagttca gcatcccgca gcacctggtg gaggaggccg agcgcgtcat 540 
ggccgaggag cgcgtagtca tgctgccccc gcgggcgcgc tccctgaagt cctttgtggt 600 
cacctcagtg gtggctttcc ccacggactc caaaacagta cagaggaccc aggacaacag 660 
ctgcagcttt ggcctgcacg cccgcggtgt ggagctgatg cgcttcacca cgcccggctt 720 
ccctgacagc ccctaccccg ctcatgcccg ctgccagtgg gccctgcggg gggacgccga 780 
ctcagtgctg agcctcacct tccgcagctt tgaccttgcg tcctgcgacg agcgcggcag 840 
cgacctggtg acggtgtaca acaccctgag ccccatggag ccccacgccc tggtgcagtt 900 
gtgtggcacc taccctccct cctacaacct gaccttccac tcctcccaga acgtcctgct 960 
catcacactg ataaccaaca ctgagcggcg gcatcccggc tttgaggcca ccttcttcca 1020 
gctgcctagg atgagcagct gtggaggccg cttacgtaaa gcccagggga cattcaacag 1080 
cccctactac ccaggccact acccacccaa cattgactgc acatggaaca ttgaggtgcc 1140 
caacaaccag catgtgaagg tgagcttcaa attcttctac ctgctggagc ccggcgtgcc 1200 
tgcgggcacc tgccccaagg actacgtgga gatcaatggg gagaaatact gcggagagag 1260 
gtcccagttc gtcgtcacca gcaacagcaa caagatcaca gttcgcttcc actcagatca 1320 
gtcctacacc gacaccggct tcttagctga atacctctcc tacgactcca gtgacccatg 1380 
cccggggcag ttcacgtgcc gcacggggcg gtgtatccgg aaggagctgc gctgtgatgg 1440 
ctgggccgac tgcaccgacc acagcgatga gctcaactgc agttgcgacg ccggccacca 1500 
gttcacgtgc aagaacaagt tctgcaagcc cctcttctgg gtctgcgaca gtgtgaacga 1560 
ctgcggagac aacagcgacg agcaggggtg cagttgtccg gcccagacct tcaggtgttc 1620 
caatgggaag tgcctctcga aaagccagca gtgcaatggg aaggacgact gtggggacgg 1680 
gtccgacgag gcctcctgcc ccaaggtgaa cgtcgtcact tgtaccaaac acacctaccg 1740 
ctgcctcaat gggctctgct tgagcaaggg caaccctgag tgtgacggga aggaggactg 1800 
tagcgacggc tcagatgaga aggactgcga ctgtgggctg cggtcattca cgagacaggc 1860 
tcgt gtt gtt ggg ggc acg gat gcg gat gag ggc gag tgg ccc tgg cag 1909
Val Val Gly Gly Thr Asp Ala Asp Glu Gly Glu Trp Pro Trp Gln
1 5 10 15

gta agc ctg cat gct ctg ggc cag ggc cac atc tgc ggt gct tcc ctc 1957
Val Ser Leu His Ala Leu Gly Gln Gly His Ile Cys Gly Ala Ser Leu
20 25 30

atc tct ccc aac tgg ctg gtc tct gcc gca cac tgc tac atc gat gac 2005
Ile Ser Pro Asn Trp Leu Val Ser Ala Ala His Cys Tyr Ile Asp Asp
35 40 45

aga gga ttc agg tac tca gac ccc acg cag tgg acg gcc ttc ctg ggc 2053
Arg Gly Phe Arg Tyr Ser Asp Pro Thr Gln Trp Thr Ala Phe Leu Gly
50 55 60

ttg cac gac cag agc cag cgc agc gcc cct ggg gtg cag gag cgc agg 2101
Leu His Asp Gln Ser Gln Arg Ser Ala Pro Gly Val Gln Glu Arg Arg
65 70 75

ctc aag cgc atc atc tcc cac ccc ttc ttc aat gac ttc acc ttc gac 2149
Leu Lys Arg Ile Ile Ser His Pro Phe Phe Asn Asp Phe Thr Phe Asp
80 85 90 95

tat gac atc gcg ctg ctg gag ctg gag aaa ccg gca gag tac agc tcc 2197
Tyr Asp Ile Ala Leu Leu Glu Leu Glu Lys Pro Ala Glu Tyr Ser Ser
100 105 110

atg gtg cgg ccc atc tgc ctg ccg gac gcc tcc cat gtc ttc cct gcc 2245
Met Val Arg Pro Ile Cys Leu Pro Asp Ala Ser His Val Phe Pro Ala
115 120 125

ggc aag gcc atc tgg gtc acg ggc tgg gga cac acc cag tat gga ggc 2293
Gly Lys Ala Ile Trp Val Thr Gly Trp Gly His Thr Gln Tyr Gly Gly
130 135 140

act ggc gcg ctg atc ctg caa aag ggt gag atc cgc gtc atc aac cag 2341
Thr Gly Ala Leu Ile Leu Gln Lys Gly Glu Ile Arg Val Ile Asn Gln
145 150 155

acc acc tgc gag aac ctc ctg ccg cag cag atc acg ccg cgc atg atg 2389
Thr Thr Cys Glu Asn Leu Leu Pro Gln Gln Ile Thr Pro Arg Met Met
160 165 170 175

tgc gtg ggc ttc ctc agc ggc ggc gtg gac tcc tgc cag ggt gat tcc 2437
Cys Val Gly Phe Leu Ser Gly Gly Val Asp Ser Cys Gln Gly Asp Ser
180 185 190

ggg gga ccc ctg tcc agc gtg gag gcg gat ggg cgg atc ttc cag gcc 2485
Gly Gly Pro Leu Ser Ser Val Glu Ala Asp Gly Arg Ile Phe Gln Ala
195 200 205

ggt gtg gtg agc tgg gga gac ggc tgc gct cag agg aac aag cca ggc 2533
Gly Val Val Ser Trp Gly Asp Gly Cys Ala Gln Arg Asn Lys Pro Gly
210 215 220

gtg tac aca agg ctc cct ctg ttt cgg gac tgg atc aaa gag aac act 2581
Val Tyr Thr Arg Leu Pro Leu Phe Arg Asp Trp Ile Lys Glu Asn Thr
225 230 235

ggg gta tag gggccggggc cacccaaatg tgtacacctg cggggccacc 2630
Gly Val *
240

catcgtccac cccagtgtgc acgcctgcag gctggagact ggaccgctga ctgcaccagc 2690 
gcccccagaa catacactgt gaactcaatc tccagggctc caaatctgcc tagaaaacct 2750 
ctcgcttcct cagcctccaa agtggagctg ggaggtagaa ggggaggaca ctggtggttc 2810 
tactgaccca actgggggca aaggtttgaa gacacagcct cccccgccag ccccaagctg 2870 
ggccgaggcg cgtttgtgta tatctgcctc ccctgtctgt aaggagcagc gggaacggag 2930
cttcggagcc tcctcagtga aggtggtggg gctgccggat ctgggctgtg gggcccttgg 2990
gccacgctct tgaggaagcc caggctcgga ggaccctgga aaacagacgg gtctgagact 3050
gaaattgttt taccagctcc cagggtggac ttcagtgtgt gtatttgtgt aaatgggtaa 3110
aacaatttat ttctttttaa aaaaaaaaaa aaaaaaa 3147

<210> 4 

<211> 241 

<212> PRT 

<213> Homo Sapien 

<400> 4 

Val Val Gly Gly Thr Asp Ala Asp Glu Gly Glu Trp Pro Trp Gln Val
1 5 10 15

Ser Leu His Ala Leu Gly Gln Gly His Ile Cys Gly Ala Ser Leu Ile
20 25 30

Ser Pro Asn Trp Leu Val Ser Ala Ala His Cys Tyr Ile Asp Asp Arg
35 40 45

Gly Phe Arg Tyr Ser Asp Pro Thr Gln Trp Thr Ala Phe Leu Gly Leu
50 55 60

His Asp Gln Ser Gln Arg Ser Ala Pro Gly Val Gln Glu Arg Arg Leu
65 70 75 80

Lys Arg Ile Ile Ser His Pro Phe Phe Asn Asp Phe Thr Phe Asp Tyr
85 90 95

Asp Ile Ala Leu Leu Glu Leu Glu Lys Pro Ala Glu Tyr Ser Ser Met
100 105 110

Val Arg Pro Ile Cys Leu Pro Asp Ala Ser His Val Phe Pro Ala Gly
115 120 125

Lys Ala Ile Trp Val Thr Gly Trp Gly His Thr Gln Tyr Gly Gly Thr
130 135 140

Gly Ala Leu Ile Leu Gln Lys Gly Glu Ile Arg Val Ile Asn Gln Thr
145 150 155 160

Thr Cys Glu Asn Leu Leu Pro Gln Gln Ile Thr Pro Arg Met Met Cys
165 170 175

Val Gly Phe Leu Ser Gly Gly Val Asp Ser Cys Gln Gly Asp Ser Gly
180 185 190

Gly Pro Leu Ser Ser Val Glu Ala Asp Gly Arg Ile Phe Gln Ala Gly
195 200 205

Val Val Ser Trp Gly Asp Gly Cys Ala Gln Arg Asn Lys Pro Gly Val
210 215 220

Tyr Thr Arg Leu Pro Leu Phe Arg Asp Trp Ile Lys Glu Asn Thr Gly
225 230 235 240

Val

<210> 5
<211> 2293
<212> DNA

<213> Artificial Sequence
<220>

<223> CVSP16 Full Length cDNA
<221> CDS

<222> (1)...(2259)

<223> CVSP16 Full Length

<400> 5

atg gcc cgg cag ctg ctc ctc ccc ctt gtg gtg ctt gtc atc agt ccc 48
Met Ala Arg Gln Leu Leu Leu Pro Leu Val Val Leu Val Ile Ser Pro
1 5 10 15

atc cca gga gcc ttc cag gac tca gct ctc agt cct acc cag gaa gaa 96
Ile Pro Gly Ala Phe Gln Asp Ser Ala Leu Ser Pro Thr Gln Glu Glu
20 25 30

cct gaa gat ctg gac tgc ggg cgc cct gag ccc tcg gcc cgc atc gtg 144
Pro Glu Asp Leu Asp Cys Gly Arg Pro Glu Pro Ser Ala Arg Ile Val
35 40 45

ggg ggc tca aac gcg cag ccg ggc acc tgg cct tgg caa gtg agc ctg 192
Gly Gly Ser Asn Ala Gln Pro Gly Thr Trp Pro Trp Gln Val Ser Leu
50 55 60

cac cat gga ggt ggc cac atc tgc ggg ggc tcc ctc atc gcc ccc tcc 240
His His Gly Gly Gly His Ile Cys Gly Gly Ser Leu Ile Ala Pro Ser
65 70 75 80

tgg gtc ctc tcc gcc gct cac tgt ttc atg acg aat ggg acg ctg gag 288
Trp Val Leu Ser Ala Ala His Cys Phe Met Thr Asn Gly Thr Leu Glu
85 90 95

ccc gcg gcc gag tgg tcg gta ctg ctg ggc gtg cac tcc cag gac ggg 336
Pro Ala Ala Glu Trp Ser Val Leu Leu Gly Val His Ser Gln Asp Gly
100 105 110

ccc ctg gac ggc gcg cac acc cgc gca gtg gcc gcc atc gtg gtg ccg 384
Pro Leu Asp Gly Ala His Thr Arg Ala Val Ala Ala Ile Val Val Pro
115 120 125

gcc aac tac agc caa gtg gag ctg ggc gcc gac ctg gcc ctg ctg cgc 432
Ala Asn Tyr Ser Gln Val Glu Leu Gly Ala Asp Leu Ala Leu Leu Arg
130 135 140

ctg gcc tca ccc gcc agc ctg ggc ccc gcc gtg tgg cct gtc tgc ctg 480
Leu Ala Ser Pro Ala Ser Leu Gly Pro Ala Val Trp Pro Val Cys Leu
145 150 155 160

ccc cgc gcc tca cac cgc ttc gtg cac ggc acc gcc tgc tgg gcc acc 528
Pro Arg Ala Ser His Arg Phe Val His Gly Thr Ala Cys Trp Ala Thr
165 170 175

ggc tgg gga gac gtc cag gag gca gat cct ctg cct ctc ccc tgg gtg 576
Gly Trp Gly Asp Val Gln Glu Ala Asp Pro Leu Pro Leu Pro Trp Val
180 185 190

cta cag gaa gtg gag cta agg ctg ctg ggc gag gcc acc tgt caa tgt 624
Leu Gln Glu Val Glu Leu Arg Leu Leu Gly Glu Ala Thr Cys Gln Cys
195 200 205

ctc tac agc cag ccc ggt ccc ttc aac ctc act ctc cag ata ttg cca 672
Leu Tyr Ser Gln Pro Gly Pro Phe Asn Leu Thr Leu Gln Ile Leu Pro
210 215 220

ggg atg ctg tgt gct ggc tac cca ggg ggc cgc agg gac acc tgc cag 720
Gly Met Leu Cys Ala Gly Tyr Pro Gly Gly Arg Arg Asp Thr Cys Gln
225 230 235 240

ggt gac tct ggg ggg ccc ctg gtc tgt gag gaa ggc ggc cgc tgg ttc 768
Gly Asp Ser Gly Gly Pro Leu Val Cys Glu Glu Gly Gly Arg Trp Phe
245 250 255

cag gca gga atc acc agc ttt ggc ttt ggc tgt gga cgg aga aac cgc 816
Gln Ala Gly Ile Thr Ser Phe Gly Phe Gly Cys Gly Arg Arg Asn Arg
260 265 270

cct gga gtt ttc act gct gtg gct acc tat gag gca tgg ata cgg gag 864
Pro Gly Val Phe Thr Ala Val Ala Thr Tyr Glu Ala Trp Ile Arg Glu
275 280 285

cag gtg atg ggt tca gag cct ggg cct gcc ttt ccc acc cag ccc cag 912
Gln Val Met Gly Ser Glu Pro Gly Pro Ala Phe Pro Thr Gln Pro Gln
290 295 300

aag acc cag tca gat ccc cag gag ccc agg gag gag aac tgc acc att 960
Lys Thr Gln Ser Asp Pro Gln Glu Pro Arg Glu Glu Asn Cys Thr Ile
305 310 315 320

gcc ctg cct gag tgc ggg aag gcc ccg cgg cca ggg gcc tgg ccc tgg 1008
Ala Leu Pro Glu Cys Gly Lys Ala Pro Arg Pro Gly Ala Trp Pro Trp
325 330 335

gag gcc cag gtg atg gtg cca gga tcc aga ccc tgc cat ggg gcg ctg 1056
Glu Ala Gln Val Met Val Pro Gly Ser Arg Pro Cys His Gly Ala Leu
340 345 350

gtg tct gaa agc tgg gtc ttg gca cct gcc agc tgc ttt ctg gac ccg 1104
Val Ser Glu Ser Trp Val Leu Ala Pro Ala Ser Cys Phe Leu Asp Pro
355 360 365

aac agc tcc gac agc cca ccc cgc gac ctc gac gcc tgg cgc gtg ctg 1152
Asn Ser Ser Asp Ser Pro Pro Arg Asp Leu Asp Ala Trp Arg Val Leu
370 375 380

ctg ccc tcg cac ccg cgc gcg gag cgg gtg gcg cgc ctg gtg cag cac 1200
Leu Pro Ser His Pro Arg Ala Glu Arg Val Ala Arg Leu Val Gln His
385 390 395 400

gag aac gct tcg tgg gac aac gcc ccg gac ctg gcg ctg ctg cag ctg 1248
Glu Asn Ala Ser Trp Asp Asn Ala Pro Asp Leu Ala Leu Leu Gln Leu
405 410 415

cgc acg ccc gtg aac ctg agt gcg gct tcg cgg ccc gtg tgc cta ccc 1296
Arg Thr Pro Val Asn Leu Ser Ala Ala Ser Arg Pro Val Cys Leu Pro
420 425 430

cac ccg gaa cac tac ttc ctg ccc ggg agc cgc tgc cgc ctg gcc cgc 1344
His Pro Glu His Tyr Phe Leu Pro Gly Ser Arg Cys Arg Leu Ala Arg
435 440 445

tgg ggc cgc ggg gaa ccc gcg ctt ggc cca ggc gcg ctg ctg gag gcg 1392
Trp Gly Arg Gly Glu Pro Ala Leu Gly Pro Gly Ala Leu Leu Glu Ala
450 455 460

gag ctg tta ggc ggc tgg tgg tgc cac tgc ctg tac ggc cgc cag ggg 1440
Glu Leu Leu Gly Gly Trp Trp Cys His Cys Leu Tyr Gly Arg Gln Gly
465 470 475 480

gcg gca gta ccg ctg ccc gga gac ccg ccg cac gcg ctc tgc cct gcc 1488
Ala Ala Val Pro Leu Pro Gly Asp Pro Pro His Ala Leu Cys Pro Ala
485 490 495

tac cag gaa aag gag gag gtg ggc agc tgc tgg aat gac tcg cgt tgg 1536
Tyr Gln Glu Lys Glu Glu Val Gly Ser Cys Trp Asn Asp Ser Arg Trp
500 505 510

agc ctt ttg tgc cag gag gag ggg acc tgg ttt ctg gct gga atc aga 1584
Ser Leu Leu Cys Gln Glu Glu Gly Thr Trp Phe Leu Ala Gly Ile Arg
515 520 525

gac ttt ccc agt ggc tgt cta cgt ccc cga gcc ttc ttc cct ctg cag 1632
Asp Phe Pro Ser Gly Cys Leu Arg Pro Arg Ala Phe Phe Pro Leu Gln
530 535 540

act cat ggc cca tgg atc agc cat gtg act cgg gga gcc tac ctg gag 1680
Thr His Gly Pro Trp Ile Ser His Val Thr Arg Gly Ala Tyr Leu Glu
545 550 555 560

gac cag cta gcc tgg gac tgg ggc cct gat ggg gag gag act gag aca 1728
Asp Gln Leu Ala Trp Asp Trp Gly Pro Asp Gly Glu Glu Thr Glu Thr
565 570 575

cag act tgt ccc cca cac aca gag cat ggt gcc tgt ggc ctg cgg ctg 1776
Gln Thr Cys Pro Pro His Thr Glu His Gly Ala Cys Gly Leu Arg Leu
580 585 590

gag gct gct cca gtg ggg gtc ctg tgg ccc tgg ctg gca gag gtg cat 1824
Glu Ala Ala Pro Val Gly Val Leu Trp Pro Trp Leu Ala Glu Val His
595 600 605

gtg gct ggt gat cga gtc tgc act ggg atc ctc ctg gcc cca ggc tgg 1872
Val Ala Gly Asp Arg Val Cys Thr Gly Ile Leu Leu Ala Pro Gly Trp
610 615 620

gtc ctg gca gcc act cac tgt gtc ctc agg cca ggc tct aca aca gtg 1920
Val Leu Ala Ala Thr His Cys Val Leu Arg Pro Gly Ser Thr Thr Val
625 630 635 640

cct tac att gaa gtg tat ctg ggc cgg gca ggg gcc agc tcc ctc cca 1968
Pro Tyr Ile Glu Val Tyr Leu Gly Arg Ala Gly Ala Ser Ser Leu Pro
645 650 655

cag ggc cac cag atg acc tca gca ccg ccc ctc ctg tgc cag atg acg 2016
Gln Gly His Gln Met Thr Ser Ala Pro Pro Leu Leu Cys Gln Met Thr
660 665 670

gaa ggg tcc tgg atc ctc gtg ggc atg gct gtt caa ggg agc cgg gag 2064
Glu Gly Ser Trp Ile Leu Val Gly Met Ala Val Gln Gly Ser Arg Glu
675 680 685

ctg ttt gct gcc att ggt cct gaa gag gcc tgg atc tcc cag aca gtg 2112
Leu Phe Ala Ala Ile Gly Pro Glu Glu Ala Trp Ile Ser Gln Thr Val
690 695 700

gga gag gcc aac ttc ctg ccc ccc agt ggc tcc cca cac tgg ccc act 2160
Gly Glu Ala Asn Phe Leu Pro Pro Ser Gly Ser Pro His Trp Pro Thr
705 710 715 720

gga ggc agc aat ctc tgc ccc cca gaa ctg gcc aag gcc tcg gga tcc 2208
Gly Gly Ser Asn Leu Cys Pro Pro Glu Leu Ala Lys Ala Ser Gly Ser
725 730 735

ccg cat gca gtc tac ttc ctg ctc ctg ctg act ctc ctg atc cag agc 2256
Pro His Ala Val Tyr Phe Leu Leu Leu Leu Thr Leu Leu Ile Gln Ser
740 745 750

tga ggggctaggg tcccagcacc acttccccct tctc 2293

<210> 6
<211> 752
<212> PRT

<213> Artificial Sequence
<220>

<223> CVSP16 Full Length Protein

<400> 6

Met Ala Arg Gln Leu Leu Leu Pro Leu Val Val Leu Val Ile Ser Pro
1 5 10 15

Ile Pro Gly Ala Phe Gln Asp Ser Ala Leu Ser Pro Thr Gln Glu Glu
20 25 30

Pro Glu Asp Leu Asp Cys Gly Arg Pro Glu Pro Ser Ala Arg Ile Val
35 40 45

Gly Gly Ser Asn Ala Gln Pro Gly Thr Trp Pro Trp Gln Val Ser Leu
50 55 60

His His Gly Gly Gly His Ile Cys Gly Gly Ser Leu Ile Ala Pro Ser
65 70 75 80

Trp Val Leu Ser Ala Ala His Cys Phe Met Thr Asn Gly Thr Leu Glu
85 90 95

Pro Ala Ala Glu Trp Ser Val Leu Leu Gly Val His Ser Gln Asp Gly
100 105 110

Pro Leu Asp Gly Ala His Thr Arg Ala Val Ala Ala Ile Val Val Pro
115 120 125

Ala Asn Tyr Ser Gln Val Glu Leu Gly Ala Asp Leu Ala Leu Leu Arg
130 135 140

Leu Ala Ser Pro Ala Ser Leu Gly Pro Ala Val Trp Pro Val Cys Leu
145 150 155 160

Pro Arg Ala Ser His Arg Phe Val His Gly Thr Ala Cys Trp Ala Thr
165 170 175

Gly Trp Gly Asp Val Gln Glu Ala Asp Pro Leu Pro Leu Pro Trp Val
180 185 190

Leu Gln Glu Val Glu Leu Arg Leu Leu Gly Glu Ala Thr Cys Gln Cys
195 200 205

Leu Tyr Ser Gln Pro Gly Pro Phe Asn Leu Thr Leu Gln Ile Leu Pro
210 215 220

Gly Met Leu Cys Ala Gly Tyr Pro Gly Gly Arg Arg Asp Thr Cys Gln
225 230 235 240

Gly Asp Ser Gly Gly Pro Leu Val Cys Glu Glu Gly Gly Arg Trp Phe
245 250 255

Gln Ala Gly Ile Thr Ser Phe Gly Phe Gly Cys Gly Arg Arg Asn Arg
260 265 270

Pro Gly Val Phe Thr Ala Val Ala Thr Tyr Glu Ala Trp Ile Arg Glu
275 280 285

Gln Val Met Gly Ser Glu Pro Gly Pro Ala Phe Pro Thr Gln Pro Gln
290 295 300

Lys Thr Gln Ser Asp Pro Gln Glu Pro Arg Glu Glu Asn Cys Thr Ile
305 310 315 320

Ala Leu Pro Glu Cys Gly Lys Ala Pro Arg Pro Gly Ala Trp Pro Trp
325 330 335

Glu Ala Gln Val Met Val Pro Gly Ser Arg Pro Cys His Gly Ala Leu
340 345 350

Val Ser Glu Ser Trp Val Leu Ala Pro Ala Ser Cys Phe Leu Asp Pro
355 360 365

Asn Ser Ser Asp Ser Pro Pro Arg Asp Leu Asp Ala Trp Arg Val Leu
370 375 380

Leu Pro Ser His Pro Arg Ala Glu Arg Val Ala Arg Leu Val Gln His
385 390 395 400

Glu Asn Ala Ser Trp Asp Asn Ala Pro Asp Leu Ala Leu Leu Gln Leu
405 410 415

Arg Thr Pro Val Asn Leu Ser Ala Ala Ser Arg Pro Val Cys Leu Pro
420 425 430

His Pro Glu His Tyr Phe Leu Pro Gly Ser Arg Cys Arg Leu Ala Arg
435 440 445

Trp Gly Arg Gly Glu Pro Ala Leu Gly Pro Gly Ala Leu Leu Glu Ala
450 455 460

Glu Leu Leu Gly Gly Trp Trp Cys His Cys Leu Tyr Gly Arg Gln Gly
465 470 475 480

Ala Ala Val Pro Leu Pro Gly Asp Pro Pro His Ala Leu Cys Pro Ala
485 490 495

Tyr Gln Glu Lys Glu Glu Val Gly Ser Cys Trp Asn Asp Ser Arg Trp
500 505 510

Ser Leu Leu Cys Gln Glu Glu Gly Thr Trp Phe Leu Ala Gly Ile Arg
515 520 525

Asp Phe Pro Ser Gly Cys Leu Arg Pro Arg Ala Phe Phe Pro Leu Gln
530 535 540

Thr His Gly Pro Trp Ile Ser His Val Thr Arg Gly Ala Tyr Leu Glu
545 550 555 560

Asp Gln Leu Ala Trp Asp Trp Gly Pro Asp Gly Glu Glu Thr Glu Thr
565 570 575

Gln Thr Cys Pro Pro His Thr Glu His Gly Ala Cys Gly Leu Arg Leu
580 585 590

Glu Ala Ala Pro Val Gly Val Leu Trp Pro Trp Leu Ala Glu Val His
595 600 605

Val Ala Gly Asp Arg Val Cys Thr Gly Ile Leu Leu Ala Pro Gly Trp
610 615 620

Val Leu Ala Ala Thr His Cys Val Leu Arg Pro Gly Ser Thr Thr Val
625 630 635 640

Pro Tyr Ile Glu Val Tyr Leu Gly Arg Ala Gly Ala Ser Ser Leu Pro
645 650 655

Gln Gly His Gln Met Thr Ser Ala Pro Pro Leu Leu Cys Gln Met Thr
660 665 670

Glu Gly Ser Trp Ile Leu Val Gly Met Ala Val Gln Gly Ser Arg Glu
675 680 685

Leu Phe Ala Ala Ile Gly Pro Glu Glu Ala Trp Ile Ser Gln Thr Val
690 695 700

Gly Glu Ala Asn Phe Leu Pro Pro Ser Gly Ser Pro His Trp Pro Thr
705 710 715 720

Gly Gly Ser Asn Leu Cys Pro Pro Glu Leu Ala Lys Ala Ser Gly Ser
725 730 735

Pro His Ala Val Tyr Phe Leu Leu Leu Leu Thr Leu Leu Ile Gln Ser
740 745 750

<210> 7
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 7

ccctctgggt agccagcaca cagcatc 27

<210> 8
<211> 26
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 8

gccatcgtgg tgccggccaa ctacag 26

<210> 9
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 9

gcacacagca tccctggcaa tatctgg 27

<210> 10
<211> 26
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 10
cggccaacta cagccaagtg gagctg 26

<210> 11 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 



<400> 11 

atggcccggc agctgctcct cccccttgtg 30

<210> 12 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 12 

cggctcccgg gcaggaagta gtgttccg 28 

<210> 13 
<211> 9276 
<212> DNA 

<213> Pichia pastoris 



<400> 13 

agatctaaca tccaaagacg aaaggttgaa tgaaaccttt ttgccatccg acatccacag 60
gtccattctc acacataagt gccaaacgca acaggagggg atacactagc agcagaccgt 120
tgcaaacgca ggacctccac tcctcttctc ctcaacaccc acttttgcca tcgaaaaacc 180
agcccagtta ttgggcttga ttggagctcg ctcattccaa ttccttctat taggctacta 240
acaccatgac tttattagcc tgtctatcct ggcccccctg gcgaggttca tgtttgttta 300
tttccgaatg caacaagctc cgcattacac ccgaacatca ctccagatga gggctttctg 360
agtgtggggt caaatagttt catgttcccc aaatggccca aaactgacag tttaaacgct 420
gtcttggaac ctaatatgac aaaagcgtga tctcatccaa gatgaactaa gtttggttcg 480
ttgaaatgct aacggccagt tggtcaaaaa gaaacttcca aaagtcgcca taccgtttgt 540
cttgtttggt actgattgac gaatgctcaa aaataatctc attaatgctt agcgcagtct 600
ctctatcgct tctgaacccc ggtgcacctg tgccgaaacg caaatgggga aacacccgct 660
ttttggatga ttatgcattg tctccacatt gtatgcttcc aagattctgg tgggaatact 720
gctgatagcc taacgttcat gatcaaaatt taactgttct aacccctact tgacagcaat 780
atataaacag aaggaagctg ccctgtctta aacctttttt tttatcatca ttattagctt 840
actttcataa ttgcgactgg ttccaattga caagcttttg attttaacga cttttaacga 900
caacttgaga agatcaaaaa acaactaatt attcgaagga tccaaacgat gagatttcct 960
tcaattttta ctgcagtttt attcgcagca tcctccgcat tagctgctcc agtcaacact 1020
acaacagaag atgaaacggc acaaattccg gctgaagctg tcatcggtta ctcagattta 1080
gaaggggatt tcgatgttgc tgttttgcca ttttccaaca gcacaaataa cgggttattg 1140
tttataaata ctactattgc cagcattgct gctaaagaag aaggggtatc tctcgagaaa 1200
agagaggctg aagcttacgt agaattccct agggcggccg cgaattaatt cgccttagac 1260
atgactgttc ctcagttcaa gttgggcact tacgagaaga ccggtcttgc tagattctaa 1320
tcaagaggat gtcagaatgc catttgcctg agagatgcag gcttcatttt tgatactttt 1380
ttatttgtaa cctatatagt ataggatttt ttttgtcatt ttgtttcttc tcgtacgagc 1440
ttgctcctga tcagcctatc tcgcagctga tgaatatctt gtggtagggg tttgggaaaa 1500
tcattcgagt ttgatgtttt tcttggtatt tcccactcct cttcagagta cagaagatta 1560
agtgagaagt tcgtttgtgc aagcttatcg ataagcttta atgcggtagt ttatcacagt 1620
taaattgcta acgcagtcag gcaccgtgta tgaaatctaa caatgcgctc atcgtcatcc 1680
tcggcaccgt caccctggat gctgtaggca taggcttggt tatgccggta ctgccgggcc 1740
tcttgcggga tatcgtccat tccgacagca tcgccagtca ctatggcgtg ctgctagcgc 1800
tatatgcgtt gatgcaattt ctatgcgcac ccgttctcgg agcactgtcc gaccgctttg 1860 
gccgccgccc agtcctgctc gcttcgctac ttggagccac tatcgactac gcgatcatgg 1920 
cgaccacacc cgtcctgtgg atctatcgaa tctaaatgta agttaaaatc tctaaataat 1980 
taaataagtc ccagtttctc catacgaacc ttaacagcat tgcggtgagc atctagacct 2040 
tcaacagcag ccagatccat cactgcttgg ccaatatgtt tcagtccctc aggagttacg 2100 
tcttgtgaag tgatgaactt ctggaaggtt gcagtgttaa ctccgctgta ttgacgggca 2160 
tatccgtacg ttggcaaagt gtggttggta ccggaggagt aatctccaca actctctgga 2220
gagtaggcac caacaaacac agatccagcg tgttgtactt gatcaacata agaagaagca 2280
ttctcgattt gcaggatcaa gtgttcagga gcgtactgat tggacatttc caaagcctgc 2340
tcgtaggttg caaccgatag ggttgtagag tgtgcaatac acttgcgtac aatttcaacc 2400
cttggcaact gcacagcttg gttgtgaaca gcatcttcaa ttctggcaag ctccttgtct 2460 
gtcatatcga cagccaacag aatcacctgg gaatcaatac catgttcagc ttgagacaga 2520 
aggtctgagg caacgaaatc tggatcagcg tatttatcag caataactag aacttcagaa 2580 
ggcccagcag gcatgtcaat actacacagg gctgatgtgt cattttgaac catcatcttg 2640 
gcagcagtaa cgaactggtt tcctggacca aatattttgt cacacttagg aacagtttct 2700
gttccgtaag ccatagcagc tactgcctgg gcgcctcctg ctagcacgat acacttagca 2760 
ccaaccttgt gggcaacgta gatgacttct ggggtaaggg taccatcctt cttaggtgga 2820 
gatgcaaaaa caatttcttt gcaaccagca actttggcag gaacacccag catcagggaa 2880 
gtggaaggca gaattgcggt tccaccagga atatagaggc caactttctc aataggtctt 2940 
gcaaaacgag agcagactac accagggcaa gtctcaactt gcaacgtctc cgttagttga 3000
gcttcatgga atttcctgac gttatctata gagagatcaa tggctctctt aacgttatct 3060 
ggcaattgca taagttcctc tgggaaagga gcttctaaca caggtgtctt caaagcgact 3120 
ccatcaaact tggcagttag ttctaaaagg gctttgtcac cattttgacg aacattgtcg 3180 
acaattggtt tgactaattc cataatctgt tccgttttct ggataggacg acgaagggca 3240 
tcttcaattt cttgtgagga ggccttagaa acgtcaattt tgcacaattc aatacgacct 3300
tcagaaggga cttctttagg tttggattct tctttaggtt gttccttggt gtatcctggc 3360 
ttggcatctc ctttccttct agtgaccttt agggacttca tatccaggtt tctctccacc 3420 
tcgtccaacg tcacaccgta cttggcacat ctaactaatg caaaataaaa taagtcagca 3480 
cattcccagg ctatatcttc cttggattta gcttctgcaa gttcatcagc ttcctcccta 3540 
attttagcgt tcaacaaaac ttcgtcgtca aataaccgtt tggtataaga accttctgga 3600 
gcattgctct tacgatccca caaggtggct tccatggctc taagaccctt tgattggcca 3660
aaacaggaag tgcgttccaa gtgacagaaa ccaacacctg tttgttcaac cacaaatttc 3720 
aagcagtctc catcacaatc caattcgata cccagcaact tttgagttgc tccagatgta 3780 
gcacctttat accacaaacc gtgacgacga gattggtaga ctccagtttg tgtccttata 3840
gcctccggaa tagacttttt ggacgagtac accaggccca acgagtaatt agaagagtca 3900 
gccaccaaag tagtgaatag accatcgggg cggtcagtag tcaaagacgc caacaaaatt 3960 
tcactgacag ggaacttttt gacatcttca gaaagttcgt attcagtagt caattgccga 4020 
gcatcaataa tggggattat accagaagca acagtggaag tcacatctac caactttgcg 4080 
gtctcagaaa aagcataaac agttctacta ccgccattag tgaaactttt caaatcgccc 4140 
agtggagaag aaaaaggcac agcgatacta gcattagcgg gcaaggatgc aactttatca 4200 
accagggtcc tatagataac cctagcgcct gggatcatcc tttggacaac tctttctgcc 4260 
aaatctaggt ccaaaatcac ttcattgata ccattattgt acaacttgag caagttgtcg 4320 
atcagctcct caaattggtc ctctgtaacg gatgactcaa cttgcacatt aacttgaagc 4380 
tcagtcgatt gagtgaactt gatcaggttg tgcagctggt cagcagcata gggaaacacg 4440 
gcttttccta ccaaactcaa ggaattatca aactctgcaa cacttgcgta tgcaggtagc 4500 
aagggaaatg tcatacttga agtcggacag tgagtgtagt cttgagaaat tctgaagccg 4560 
tatttttatt atcagtgagt cagtcatcag gagatcctct acgccggacg catcgtggcc 4620 
gacctgcagg gggggggggg gcgctgaggt ctgcctcgtg aagaaggtgt tgctgactca 4680 
taccaggcct gaatcgcccc atcatccagc cagaaagtga gggagccacg gttgatgaga 4740
gctttgttgt aggtggacca gttggtgatt ttgaactttt gctttgccac ggaacggtct 4800
gcgttgtcgg gaagatgcgt gatctgatcc ttcaactcag caaaagttcg atttattcaa 4860
caaagccgcc gtcccgtcaa gtcagcgtaa tgctctgcca gtgttacaac caattaacca 4920
attctgatta gaaaaactca tcgagcatca aatgaaactg caatttattc atatcaggat 4980 
tatcaatacc atatttttga aaaagccgtt tctgtaatga aggagaaaac tcaccgaggc 5040 
agttccatag gatggcaaga tcctggtatc ggtctgcgat tccgactcgt ccaacatcaa 5100 
tacaacctat taatttcccc tcgtcaaaaa taaggttatc aagtgagaaa tcaccatgag 5160 
tgacgactga atccggtgag aatggcaaaa gcttatgcat ttctttccag acttgttcaa 5220
caggccagcc attacgctcg tcatcaaaat cactcgcatc aaccaaaccg ttattcattc 5280
gtgattgcgc ctgagcgaga cgaaatacgc gatcgctgtt aaaaggacaa ttacaaacag 5340
gaatcgaatg caaccggcgc aggaacactg ccagcgcatc aacaatattt tcacctgaat 5400 
caggatattc ttctaatacc tggaatgctg ttttcccggg gatcgcagtg gtgagtaacc 5460 
atgcatcatc aggagtacgg ataaaatgct tgatggtcgg aagaggcata aattccgtca 5520 
gccagtttag tctgaccatc tcatctgtaa catcattggc aacgctacct ttgccatgtt 5580
tcagaaacaa ctctggcgca tcgggcttcc catacaatcg atagattgtc gcacctgatt 5640 
gcccgacatt atcgcgagcc catttatacc catataaatc agcatccatg ttggaattta 5700 
atcgcggcct cgagcaagac gtttcccgtt gaatatggct cataacaccc cttgtattac 5760 
tgtttatgta agcagacagt tttattgttc atgatgatat atttttatct tgtgcaatgt 5820 
aacatcagag attttgagac acaacgtggc tttccccccc ccccctgcag gtcggcatca 5880 
ccggcgccac aggtgcggtt gctggcgcct atatcgccga catcaccgat ggggaagatc 5940 
gggctcgcca cttcgggctc atgagcgctt gtttcggcgt gggtatggtg gcaggccccg 6000 
tggccggggg actgttgggc gccatctcct tgcatgcacc attccttgcg gcggcggtgc 6060 
tcaacggcct caacctacta ctgggctgct tcctaatgca ggagtcgcat aagggagagc 6120 
gtcgagtatc tatgattgga agtatgggaa tggtgatacc cgcattcttc agtgtcttga 6180 
ggtctcctat cagattatgc ccaactaaag caaccggagg aggagatttc atggtaaatt 6240 
tctctgactt ttggtcatca gtagactcga actgtgagac tatctcggtt atgacagcag 6300
aaatgtcctt cttggagaca gtaaatgaag tcccaccaat aaagaaatcc ttgttatcag 6360
gaacaaactt cttgtttcga actttttcgg tgccttgaac tataaaatgt agagtggata 6420 
tgtcgggtag gaatggagcg ggcaaatgct taccttctgg accttcaaga ggtatgtagg 6480 
gtttgtagat actgatgcca acttcagtga caacgttgct atttcgttca aaccattccg 6540
aatccagaga aatcaaagtt gtttgtctac tattgatcca agccagtgcg gtcttgaaac 6600 
tgacaatagt gtgctcgtgt tttgaggtca tctttgtatg aataaatcta gtctttgatc 6660 
taaataatct tgacgagcca aggcgataaa tacccaaatc taaaactctt ttaaaacgtt 6720
aaaaggacaa gtatgtctgc ctgtattaaa ccccaaatca gctcgtagtc tgatcctcat 6780
caacttgagg ggcactatct tgttttagag aaatttgcgg agatgcgata tcgagaaaaa 6840
ggtacgctga ttttaaacgt gaaatttatc tcaagatctc tgcctcgcgc gtttcggtga 6900 
tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt gtctgtaagc 6960 
ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg ggtgtcgggg 7020 
cgcagccatg acccagtcac gtagcgatag cggagtgtat actggcttaa ctatgcggca 7080 
tcagagcaga ttgtactgag agtgcaccat atgcggtgtg aaataccgca cagatgcgta 7140 
aggagaaaat accgcatcag gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 7200
gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 7260
gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 7320
cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 7380 
aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 7440 
tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 7500
ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc aatgctcacg ctgtaggtat 7560
ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 7620
cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 7680
ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 7740
gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt 7800
atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 7860
aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 7920
aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 7980
gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 8040 
cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 8100 
gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca 8160 
tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct 8220 
ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 8280
ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 8340
atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 8400
cgcaacgttg ttgccattgc tgcaggcatc gtggtgtcac gctcgtcgtt tggtatggct 8460
tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 8520 
aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 8580 
tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 8640
ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 8700
agttgctctt gcccggcgtc aacacgggat aataccgcgc cacatagcag aactttaaaa 8760
gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 8820
agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 8880
accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 8940 
gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat 9000 
cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 9060 
ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tctaagaaac cattattatc 9120 
atgacattaa cctataaaaa taggcgtatc acgaggccct ttcgtcttca agaattaatt 9180 
ctcatgtttg acagcttatc atcgataagc tgactcatgt tggtattgtg aaatagacgc 9240 
agatcgggaa cactgaaaaa taacagttat tattcg 9276

<210> 14 
<211> 11 
<212> PRT 

<213> Pichia pastoris 
<400> 14 

Lys Arg Ile Ala Ser Gly Val Ile Ala Pro Lys
1 5 10



<210> 15 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 15 

tgggtcttgg cacctgccag ctgctttctg 30

<210> 16
<211> 28
<212> DNA

<213> Artificial Sequence
<220> 

<223> Primer 
<400> 16 

gaagggggaa gtggtgctgg gaccctag 28 

<210> 17 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 17 

ccctctgggt agccagcaca cagcatc 27 

<210> 18 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer
<400> 18

gccatcgtgg tgccggccaa ctacag 26

<210> 19 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 19 

atcgtggtgc cggccaacta cagccaagtg 30

<210> 20 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 20 

acccatcacc tgctcccgta tccatgcctc 30

<210> 21 
<211> 103 
<212> PRT 

<213> Artificial Sequence 
<220> 

<213> Homo Sapien 
<400> 21

Val Ser Arg Leu Val Ile Ser Ile Arg Leu Pro Gln His Leu Gly Leu
1 5 10 15

Arg Pro Pro Leu Ala Leu Leu Glu Leu Ser Ser Arg Val Glu Pro Ser
20 25 30

Pro Ser Ala Leu Pro Ile Cys Leu His Pro Ala Gly Ile Pro Pro Gly
35 40 45

Ala Ser Cys Trp Val Leu Gly Trp Lys Glu Pro Gln Asp Arg Val Pro
50 55 60

Val Ala Ala Ala Val Ser Ile Leu Thr Gln Arg Ile Cys Asp Cys Leu
65 70 75 80

Tyr Gln Gly Ile Leu Pro Pro Gly Thr Leu Cys Val Leu Tyr Ala Glu
85 90 95

Gly Gln Glu Asn Arg Cys Glu
100

<210> 22 
<211> 37 
<212> PRT 

<213> Artificial Sequence 
<220> 

<213> Homo Sapien 
<400> 22 

Asn Asp Ser Arg Trp Ser Leu Leu Cys Gln Glu Glu Gly Thr Trp Phe
1 5 10 15

Leu Ala Gly Ile Arg Asp Phe Pro Ser Gly Cys Leu Arg Pro Arg Ala
20 25 30

Phe Phe Pro Leu Gln



